Abstract

Self-adjusting computation offers a language-based ap- 
proach to writing programs that automatically respond to 
dynamically changing data. Recent work made significant 
progress in developing sound semantics and associated im- 
plementations of self-adjusting computation for high-level, 
functional languages. These techniques, however, do not ad- 
dress issues that arise for low-level languages, i.e., stack- 
based imperative languages that lack strong type systems 
and automatic memory management. 

In this paper, we describe techniques for self-adjusting 
computation which are suitable for low-level languages. 
Necessarily, we take a different approach than previous 
work: instead of starting with a high-level language with ad- 
ditional primitives to support self-adjusting computation, we 
start with a low-level intermediate language, whose seman- 
tics is given by a stack-based abstract machine. We prove 
that this semantics is sound: it always updates computations 
in a way that is consistent with full reevaluation. We give a 
compiler and runtime system for the intermediate language 
used by our abstract machine. We present an empirical eval- 
uation that shows that our approach is efficient in practice, 
and performs favorably compared to prior proposals. 

1. Introduction 

Many applications operate on data that changes incremen- 
tally, i.e., by a small amount, over time. Such incremental 
changes often require only incremental updates to the out- 
put, making it possible to respond to dynamically chang- 
ing data more efficiently than recomputing the output from 
scratch. These improvements are often asymptotically sig- 
nificant, providing as much as a linear factor of speedup. 
To exploit this potential, one can develop "dynamic" or "ki- 
netic" algorithms that are designed to deal with particular 
forms of changing input by taking advantage of the particular
structure of the problem at hand. This manual
approach often yields updates that are asymptotically faster 
than full reevaluation, but carries inherent complexity and 
non-compositionality that makes the algorithms difficult to 
design, analyze, and use. 

As an alternative to manual design of dynamic and ki- 
netic algorithms, the programming languages community 
has developed techniques that either automate or mostly au- 
tomate the process of translating an implementation of an 
algorithm for fixed input into a version for changing in- 
put. This is a challenging problem because the compiler is 
expected to improve the asymptotic complexity of the pro- 
gram. Many different approaches have been considered; for 
more detail on previous work we refer the reader to Ramalingam
and Reps' survey [33] and to Section 10. Recent advances
on self-adjusting computation [3] made substantial
progress on this problem by proposing techniques that allow 
both purely functional and imperative programs to automat- 
ically respond to changes in their data. The approach has 
been shown to be effective in a reasonably broad range of 
areas including computational geometry, invariant checking, 
motion simulation, and machine learning (e.g., [4, 6, 34])
and has even helped solve challenging open problems [8].

Self-adjusting computation typically relies on program- 
mer help to identify the data that can change over time, 
called changeable data, and the dependencies between this 
data and program code. This changeable data is typically 
stored in special memory cells referred to as modifiable ref- 
erences (modifiables for short), so called because they can 
undergo incremental modification. The read and write de- 
pendencies of modifiables are recorded in a dynamic execu- 
tion trace (or trace, for short), which effectively summarizes 
the self-adjusting computation. When modifiables change, 
the trace is automatically edited through a change propaga- 
tion algorithm: some portions of the trace are reevaluated 
(when the corresponding subcomputations are affected by 
a changed value), some portions are discarded (e.g., when 
reevaluation changes control paths) and some portions are 
reused (when a subcomputation remains unaffected, i.e., 
when it remains consistent with the values of modifiables). 
We typically say that a semantics for self-adjusting com- 
putation is sound (alternatively, consistent), if the change 
propagation mechanism always yields a result consistent
with full reevaluation. 

The initial approaches for self-adjusting computation offer
programming interfaces within existing functional languages,
namely, SML and Haskell, either via a library [17, 14]
or with special compiler support [27]. However, in all these
systems, self-adjusting programs have a purely-functional 
flavor, as modifiables must be written exactly once. Later, 
Acar et al. lifted this write-once restriction by giving a 
higher-order imperative semantics for self-adjusting computation [5].

Unfortunately, this imperative semantics is not well- 
suited for modeling low-level languages — by low-level we 
mean (here and throughout) stack-based imperative lan- 
guages that lack strong type systems and automatic mem- 
ory management. First, the imperative semantics assumes 
that only modifiables are mutable: all other data is implic- 
itly assumed to be immutable. While a strong type system 
can enforce this policy, in a low-level setting, all data is 
mutable by default, and there is no strong type system to 
enforce other policies. Next, the imperative semantics im- 
plicitly assumes that all garbage is collected automatically. 
This includes garbage from the self-adjusting program itself, 
as well as from updating its trace via change propagation. 
Such automatic collection cannot be assumed for low-level 
languages. Finally, and perhaps most importantly, the imper- 
ative semantics provides no account of how execution traces 
should be incrementally edited by the system for reuse. In- 
stead, the semantics effectively relies on an oracle to gener- 
ate reusable traces, and leaves the internal behavior of this 
oracle unspecified. Consequently, the oracle hides many of 
the practical issues that would otherwise arise, such as how
memory allocation and collection interact with trace reuse. 

Based on their imperative semantics, Acar et al. describe 
a library-based implementation for SML [5]. Following this
library interface, CEAL [23] provides compiler support to
write self-adjusting computations in C. However, because of 
the issues raised above, the soundness property proven for 
the semantics generally does not hold for CEAL programs 
unless they adhere to various correct-usage restrictions. In 
particular, CEAL programs must only mutate modifiables 
and local variables; global variables, return values¹ and
user-defined data structures must be immutable (and hence, 
non-modifiable). Furthermore, since even immutable data 
must first be initialized in a low-level setting, and since this 
initialization is itself a case of mutation, CEAL programs are 
required to treat such initialization code in a special way. 
Namely, they must separate it into designated "initialization 
functions", as introduced in previous work on automatic 
memory management for self-adjusting computation [21].

¹ The imperative semantics restricts return types to unit (i.e., void).

Failing to follow the correct-usage restrictions given
above, a CEAL program could crash, or alternatively, fail
to provide correct updates. As a simple example, consider
a trivial program that calls two functions: the first copies
some input from modifiable m_in to a global variable g; the
second copies the value of g into another modifiable m_out
as output. The computational dependencies of modifiable
references m_in and m_out are traced, but those of global
variable g are not. Consequently, when m_in changes, m_out will
not be updated, since doing so requires knowledge of its
dependency on g. An analogous scenario can be constructed
using any non-modifiable memory in place of global g (e.g.,
a user-defined data type).
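
The scenario above can be imitated in a few lines of plain C. The following is only a minimal sketch under loud assumptions: the modref_t type, mod_read, mod_write and stale_output_demo are hypothetical names invented for illustration and are not the CEAL interface, and "change propagation" is reduced to rerunning the single function recorded as the last reader of a modifiable.

```c
#include <stddef.h>

/* Hypothetical modifiable: a value plus the one function that last read it. */
typedef struct modref {
  int value;
  void (*reader)(void);   /* rerun when the value changes */
} modref_t;

static modref_t m_in, m_out;
static int g;             /* ordinary global: its accesses are not traced */

/* Traced read: remember which function depends on m. */
static int mod_read(modref_t *m, void (*self)(void)) {
  m->reader = self;
  return m->value;
}

/* Write with naive change propagation: rerun the recorded reader, if any. */
static void mod_write(modref_t *m, int v) {
  if (m->value == v) return;
  m->value = v;
  if (m->reader) m->reader();
}

static void copy_in_to_g(void)  { g = mod_read(&m_in, copy_in_to_g); }
static void copy_g_to_out(void) { mod_write(&m_out, g); }  /* reads g untraced */

/* Runs the two-function program on `first`, changes the input to `second`,
   and returns the value left in m_out. */
int stale_output_demo(int first, int second) {
  m_in.value = first;  m_in.reader = NULL;
  m_out.value = 0;     m_out.reader = NULL;
  copy_in_to_g();
  copy_g_to_out();
  mod_write(&m_in, second);   /* reruns copy_in_to_g only */
  return m_out.value;
}
```

In this sketch, changing the input from 1 to 2 updates g but leaves m_out at 1, because nothing records that copy_g_to_out depends on g.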

At present, we are aware of no generally sound imple- 
mentation of self-adjusting computation for low-level lan- 
guages, nor a semantics that suggests one. 

Self-adjusting stack machines. In this paper, we describe 
techniques for sound self-adjusting computation which are 
suitable for low-level languages. To achieve soundness with- 
out losing generality, we take a fundamentally different ap- 
proach than previous work: instead of starting with a high- 
level language with additional primitives to support self- 
adjusting computation, we start with a low-level intermediate
language called IL.

We give two semantics to IL by defining two abstract 
machines: the reference machine models conventional eval- 
uation semantics, while the tracing machine models self- 
adjusting semantics. Each machine is defined by a transition 
relation between machine configurations. Our low-level set- 
ting is reflected by the reference machine's configurations: 
each consists of a store, a stack, an environment and a pro- 
gram. The tracing machine extends these configurations with 
an execution trace. We define traced evaluation and change 
propagation within the tracing machine by including tran- 
sitions that incrementally edit the trace (i.e., transitions that 
either insert, remove or replay traced execution steps). We 
show that automatic memory management is a natural as- 
pect of automatic change propagation by defining a notion 
of garbage collection. 

Contributions. Our contributions are as follows: 

1. We provide an abstract machine semantics for self-
adjusting computation. This includes accounts of how 
change propagation interacts with a control stack, with 
return values and with memory management. We prove 
that this semantics is sound. 

2. We describe and implement a compiler and runtime system
for IL, the intermediate language used by our abstract
machines. Additionally, we give two automatic optimiza- 
tions to reduce the overhead of the approach. 

3. We describe and implement a front-end that translates 
a large subset of C into IL, and perform an empirical 
evaluation of our implementation. 

2. Overview 

We introduce the challenges for giving self-adjusting com- 
putation support to programs written in low-level languages. 
In particular, we present two example programs and consider
strategies for incrementally updating their computations.
We introduce our approach, in which we restructure
these programs in IL, our intermediate language for self- 
adjusting computation. We informally describe a change 
propagation semantics for IL programs that addresses the 
challenges from the examples. 

2.1 Example 1: Reducing Trees 

For our first example, we consider a simple evaluator for ex- 
pression trees, as expressed with user-defined C data struc- 
tures. These expression trees consist of integer-valued leaves 
and internal nodes that represent the binary operations of addition
and subtraction. Figure 1 shows their representation
in C. The tag field (either LEAF or BINOP) distinguishes between
the leaf_val and binop fields of the union u. Figure 2
gives a simple C function that evaluates these trees.

Suppose we first run eval on the expression tree shown
on the left in Figure 3, which represents ((3 + 4) - 0) +
(5 - 6); the execution returns the value 6. Suppose we
then change the expression tree to ((3 + 4) - 0) + ((5 -
6) + 5), as shown on the right in Figure 3. How should change
propagation efficiently update the output?
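
To make the two runs concrete, the following minimal sketch combines the declarations of Figure 1 with the evaluator of Figure 2 and evaluates both versions of the tree from scratch. The helpers leaf and binop, the function eval_before_and_after, and the assignment of the letters a through k to particular nodes are assumptions made for illustration; only a, b, g and j are named in the text.

```c
typedef struct node_s *node_t;
struct node_s {
  enum { LEAF, BINOP } tag;
  union {
    int leaf_val;
    struct { enum { PLUS, MINUS } op; node_t left, right; } binop;
  } u;
};

/* The evaluator of Figure 2. */
int eval(node_t root) {
  if (root->tag == LEAF) return root->u.leaf_val;
  int l = eval(root->u.binop.left);
  int r = eval(root->u.binop.right);
  return (root->u.binop.op == PLUS) ? l + r : l - r;
}

/* Hypothetical helpers that fill in caller-provided nodes. */
static node_t leaf(node_t n, int v) {
  n->tag = LEAF;
  n->u.leaf_val = v;
  return n;
}

static node_t binop(node_t n, int op, node_t l, node_t r) {
  n->tag = BINOP;
  n->u.binop.op = op;
  n->u.binop.left = l;
  n->u.binop.right = r;
  return n;
}

/* Builds ((3 + 4) - 0) + (5 - 6) and evaluates it, then replaces the right
   child of the root by (5 - 6) + 5 and evaluates again from scratch. */
void eval_before_and_after(int *before, int *after) {
  struct node_s a, b, c, d, e, f, g, h, i, j, k;
  binop(&c, PLUS, leaf(&d, 3), leaf(&e, 4));
  binop(&b, MINUS, &c, leaf(&f, 0));
  binop(&g, MINUS, leaf(&h, 5), leaf(&i, 6));
  binop(&a, PLUS, &b, &g);
  *before = eval(&a);

  binop(&j, PLUS, &g, leaf(&k, 5));   /* new node j reuses subtree g */
  a.u.binop.right = &j;
  *after = eval(&a);
}
```

Both calls to eval here perform a full reevaluation; the goal of change propagation is to obtain the second result while redoing only the work that depends on the changed pointer.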

Strategy for change propagation. We first consider the 
computation's structure, of which Figure 4 gives a summary:
the upper and lower versions summarize the computation be- 
fore and after the change, respectively. Their structure re- 
flects the stack behavior of eval, which divides each invo- 
cation into (up to) three fragments: Fragment one checks the 
tag of the node, returning the leaf value, if present, or else 
recurring on the left subtree (lines 2-5); fragment two recurs 
on the right subtree (line 6); and fragment three combines 
and returns the results (lines 7-8). 

In Figure 4, each fragment is labeled with a tree node,
e.g., b2 represents fragment two's execution on node b. The
dotted horizontal arrows indicate pushing a code fragment
on the stack for later. Solid arrows represent the flow of
control from one fragment to the next; when diagonal, they 
indicate popping the stack to continue evaluation. 

Based on the structure of these two computations, we informally
sketch a strategy for change propagation. First, since
the left half of the tree is unaffected, the left half of the
computation (a1-b3) is also unaffected, and as such, change
propagation should reuse it. Next, since the right child of
a has changed, the computation that reads this value, fragment
a2, should be reevaluated. This reevaluation recurs to
node g, whose subtree has not changed. Hence, change propagation
should reuse the corresponding computation (g1-g3),
including its return value, -1. Comparing j1-j3 against
g1-g3, we see that a's right subtree evaluates to 4 rather
than -1. Hence, change propagation should reevaluate a3,
to yield the new output of the program, 11.

Challenges. For change propagation to use the strategy 
sketched above, it must identify dependencies among data 



typedef struct node_s* node_t;
struct node_s {
  enum { LEAF, BINOP } tag;
  union { int leaf_val;
          struct { enum { PLUS, MINUS } op;
                   node_t left, right; } binop;
  } u; };





Figure 1. Type declarations for expression trees in C. 



1 int eval (node_t root) {
2   if (root->tag == LEAF)
3     return root->u.leaf_val;
4   else {
5     int l = eval (root->u.binop.left);
6     int r = eval (root->u.binop.right);
7     if (root->u.binop.op == PLUS) return (l + r);
8     else return (l - r);
9 } }



Figure 2. The eval function in C. 
Figure 3. Example expression trees.




Figure 4. Example execution traces of eval. 



 1 let eval (root) = memo
 2   let eval_right (l) =
 3     let eval_op (r) = update
 4       let op = read (root[OP]) in
 5       if (op == PLUS) then pop (l+r)
 6       else pop (l-r)
 7     in
 8     push eval_op do update
 9       let right = read (root[RIGHT]) in
10       eval (right)
11   in
12   update
13     let tag = read (root[TAG]) in
14     if (tag == LEAF)
15       let leaf_val = read (root[LEAF_VAL]) in
16       pop (leaf_val)
17     else
18       push eval_right do update
19         let left = read (root[LEFT]) in
20         eval (left)

Figure 5. The eval function in IL.
int MAX;
void array_max (int* arr, int len) {
  while (len > 1) {
    for (int i = 0; i < len - 1; i += 2) {
      int m;
      max(arr[i], arr[i + 1], &m);
      arr[i / 2] = m;
    }
    len = len / 2;
  }
  MAX = arr[0];
}





Figure 6. Iteratively compute the maximum of an array. 



Figure 7. Snapshots of the array from Figure 6.

and the three-part structure of this code, including its call/return
dependencies. In particular, it must identify where
previous computations should be reused, reevaluated or discarded.²
In Section 2.3, we discuss how the IL code of Figure 5,
which represents Figure 2, informs the change propagation
strategy described above.

2.2 Example 2: Reducing Arrays 

As a second example, Figure 6 gives C code for (destructively)
computing the maximum element of an array. Rather
than perform a single linear scan, it finds this maximum iter- 
atively by performing a logarithmic number of rounds, in the 
style of a (sequentialized) data-parallel algorithm. For sim- 
plicity, we assume that the length of arrays is always a power 
of two. Each round combines pairs of adjacent elements in 
the array, producing a sub-sequence with half the length of 
the original. The remaining half of the array contains inac- 
tive elements no longer accessed by the function. 

Rather than return values directly, we illustrate com- 
monly used imperative features of C by returning them indi- 
rectly: function max returns its result by writing to a provided 
pointer, and array_max returns its result by assigning it to a
special global variable MAX. 

Figure 7 illustrates the computation for two (closely-
related) example inputs. Below each input, each computa- 
tion consists of three snapshots of the array, one per round. 
For readability, the inactive elements of the array are still 
shown but are greyed, and the differences between the right 
and left computation are highlighted on the right. 

Strategy for change propagation. We use Figure 7 to develop
a strategy for change propagation. Recall that each array
snapshot summarizes one round of the outer while loop.
Within each snapshot, each (active) cell summarizes one iteration
of the inner for loop. That array_max uses an iterative
style affects the structure of the computation, which consequently
admits an efficient strategy for change propagation:
reevaluate each affected iteration of the inner for loop, that
is, those summarized by the highlighted cells in Figure 7.

² To see an example where computation is discarded, imagine the change
in reverse; that is, changing the lower computation into the upper one.

It is simple to (manually) check that each active cell de- 
pends on precisely two cells in the previous round, affects 
at most one cell in the next round, and is computed inde- 
pendently of other cells in the same round. Hence, for a 
single input change, at most one such iteration is affected 
per round. Since the number of rounds is logarithmic in the 
length of the input array, this change propagation strategy is 
efficient. 
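
The claim that a single input change affects at most one iteration per round can be checked mechanically on a small instance. The sketch below runs the rounds of array_max on two inputs in lockstep and counts how many active cells differ after each round. The function names one_round and max_affected_cells, the fixed length N, and the concrete inputs suggested in the usage note are assumptions chosen for illustration; they are not taken from Figure 7.

```c
#include <string.h>

#define N 8   /* array length, assumed to be a power of two */

static void max(int a, int b, int *out) { *out = (a > b) ? a : b; }

/* One round of the outer while loop of array_max; returns the new length. */
static int one_round(int *arr, int len) {
  for (int i = 0; i < len - 1; i += 2) {
    int m;
    max(arr[i], arr[i + 1], &m);
    arr[i / 2] = m;
  }
  return len / 2;
}

/* Runs two computations round by round on copies of in1 and in2 (each of
   length N). Returns the largest number of active cells that differ between
   the two computations after any single round, and stores both maxima. */
int max_affected_cells(const int *in1, const int *in2, int *max1, int *max2) {
  int a[N], b[N];
  memcpy(a, in1, sizeof a);
  memcpy(b, in2, sizeof b);
  int worst = 0;
  for (int len = N; len > 1; ) {
    one_round(a, len);
    len = one_round(b, len);
    int diff = 0;
    for (int i = 0; i < len; i++)
      diff += (a[i] != b[i]);
    if (diff > worst) worst = diff;
  }
  *max1 = a[0];
  *max2 = b[0];
  return worst;
}
```

For example, with the inputs {2, 9, 3, 5, 4, 7, 1, 6} and {2, 0, 3, 5, 4, 7, 1, 6}, which differ in one cell, each of the three rounds differs in exactly one active cell.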

Challenges. To efficiently update the computation, change 
propagation should reevaluate each affected iteration, being 
careful not to reevaluate any of the unaffected iterations. 

2.3 Introduction to IL

The primary role of IL is to make precise the computational
dependencies and possible change propagation behaviors 
of a low-level self-adjusting program. In particular, it is 
easy to answer the following questions for a program when 
expressed in IL: 

• Which data dependencies are local versus non-local?

• Which code fragments are saved on the control stack?

• Which computation fragments are saved in the computa- 
tion's trace, for later reevaluation or reuse? 

We informally introduce the syntax and semantics of IL 
by addressing each of these questions for the examples in 
Sections 2.1 and 2.2. In Section 3, we make the syntax and
semantics precise. 

Static Single Assignment. To clearly separate local and 
non-local dependencies, IL employs a (functional variant 
of) static single assignment form (SSA). Within this
representation, the control-flow constructs of C are repre- 
sented by locally-defined functions, local state is captured 
by let-bound variables and function parameters, and all non- 
local state (memory content) is explicitly allocated within 
the store and accessed via reads and writes. 

For example, we express the for loop from Figure 6 as
the recursive function for_loop in Figure 8(a). This function
takes an argument for each variable whose definition is
dependent on the for loop's control flow;³ in this case, just
the iteration variable i. Within the body of the loop, the local 
variable m is encoded by an explicit store allocation bound 
to a temporary variable m_ptr. Although not shown, global 
variable MAX is handled analogously. This kind of indirection 
is necessary whenever assignments can occur non-locally (as 

³ Where traditional SSA employs φ-operators to express control-dependent
variable definitions, functional SSA uses ordinary function abstraction.
let for_loop (i) =
  let m_ptr = alloc (1) in
  let after_max () = update
    let m_val = read (m_ptr[0]) in
    let _ = write (arr[i/2], m_val) in
    if (i < len - 1)
    then for_loop (i + 2)
    else ...
  in
  push after_max do update
    let a = read (arr[i]) in
    let b = read (arr[i + 1]) in
    max (a, b, m_ptr)
in for_loop (0)

(a)

let for_loop (i) =
  let m_ptr = alloc (1) in
  let after_max () = update
    let m_val = read (m_ptr[0]) in
    let _ = write (arr[i/2], m_val) in
    memo
    if (i < len - 1)
    then for_loop (i + 2)
    else ...
  in
  push after_max do update
    let a = read (arr[i]) in
    let b = read (arr[i + 1]) in
    max (a, b, m_ptr)
in for_loop (0)

(b)

let for_loop (i) =
  let for_next () =
    if (i < len - 1) then for_loop (i + 2)
    else ...
  in
  push for_next do
    let m_ptr = alloc (1) in
    let after_max () = update
      let m_val = read (m_ptr[0]) in
      let _ = write (arr[i/2], m_val) in
      pop ()
    in
    push after_max do update
      let a = read (arr[i]) in
      let b = read (arr[i + 1]) in
      max (a, b, m_ptr)
in for_loop (0)

(c)

Figure 8. Three versions of IL code for the for loop in Figure 6; highlighting indicates their slight differences.



with global variables like MAX) or via pointer indirection (as
with local variable m). By contrast, local variables arr, i and
len are only assigned directly and locally, and consequently,
each is a proper SSA variable in Figure 8(a). Similarly, in
Figure 2, the assignments to l and r are direct, and hence,
we express each as a proper SSA variable in Figure 5. We
explain the other IL syntax from Figures 5 and 8(a) below
(push, pop, update, memo).

Stack operations. As our first example illustrates (Section 2.1),
the control stack necessarily breaks a computation
into multiple fragments. In particular, before control
flow follows a function call, it first pushes on the stack a 
code fragment (a local continuation) which later takes con- 
trol when the call completes. 

The stack operations of IL make this code fragmentation
explicit: the expression push f do e saves function f (a code
fragment expecting zero or more arguments) on the stack
and continues by evaluating e; when this subcomputation
pops the stack, the saved function f is applied to the (zero
or more) arguments of the pop.

In Figure 5, the two recursive calls to eval are preceded
by pushes that save functions eval_right and eval_op,
corresponding to code fragments for evaluating the right
subtree (fragment two) and applying the binary operator
(fragment three), respectively. Similarly, in Figure 8(a),
the call to max is preceded by a push that saves function
after_max, corresponding to the code fragment following
the call. We note that since max returns no values,
after_max takes no arguments.
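
The fragmentation that push and pop make explicit can be mimicked in C by replacing recursion with a hand-managed stack of saved fragments. The following is a sketch only, not the paper's compiler output: the frame_t layout, the MAX_DEPTH bound, and the names eval_explicit and eval_explicit_demo are assumptions, and tracing (update and memo) is omitted entirely.

```c
#include <assert.h>

typedef struct node_s *node_t;
struct node_s {
  enum { LEAF, BINOP } tag;
  union {
    int leaf_val;
    struct { enum { PLUS, MINUS } op; node_t left, right; } binop;
  } u;
};

/* A saved code fragment: which function to resume, plus its local state. */
typedef struct {
  enum { EVAL_RIGHT, EVAL_OP } frag;
  node_t root;   /* free variable of both fragments */
  int l;         /* left result; used by EVAL_OP only */
} frame_t;

#define MAX_DEPTH 64   /* hypothetical bound on the stack depth */

int eval_explicit(node_t root) {
  frame_t stack[MAX_DEPTH];
  int sp = 0;
  node_t cur = root;
  for (;;) {
    if (cur->tag == BINOP) {
      /* push eval_right do eval (left) */
      assert(sp < MAX_DEPTH);
      stack[sp++] = (frame_t){ EVAL_RIGHT, cur, 0 };
      cur = cur->u.binop.left;
      continue;
    }
    int ret = cur->u.leaf_val;            /* pop (leaf_val) */
    for (;;) {
      if (sp == 0) return ret;            /* empty stack: final result */
      frame_t fr = stack[--sp];
      if (fr.frag == EVAL_RIGHT) {
        /* eval_right (l): push eval_op do eval (right) */
        stack[sp++] = (frame_t){ EVAL_OP, fr.root, ret };
        cur = fr.root->u.binop.right;
        break;
      }
      /* eval_op (r): pop (l + r) or pop (l - r) */
      ret = (fr.root->u.binop.op == PLUS) ? fr.l + ret : fr.l - ret;
    }
  }
}

/* Evaluates ((3 + 4) - 0) + (5 - 6) with the explicit stack. */
int eval_explicit_demo(void) {
  struct node_s d = { LEAF, { .leaf_val = 3 } }, e = { LEAF, { .leaf_val = 4 } };
  struct node_s f = { LEAF, { .leaf_val = 0 } }, h = { LEAF, { .leaf_val = 5 } };
  struct node_s i = { LEAF, { .leaf_val = 6 } };
  struct node_s c = { BINOP, { .binop = { PLUS, &d, &e } } };
  struct node_s b = { BINOP, { .binop = { MINUS, &c, &f } } };
  struct node_s g = { BINOP, { .binop = { MINUS, &h, &i } } };
  struct node_s a = { BINOP, { .binop = { PLUS, &b, &g } } };
  return eval_explicit(&a);
}
```

Each frame pairs a fragment tag with the local state the fragment needs, which is the same information an IL push saves.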

Reevaluation and reuse. To clearly mark which computa- 
tions are saved in the trace — which in turn defines which 
computations can be reevaluated and reused — IL uses the 
special forms update and memo, respectively. 

The IL expression update e, which we call an update 
point, has the same meaning as e, except that during change 
propagation, the computation of e can be recovered from 
the program's original computation and reevaluated. This
reevaluation is necessary exactly when the original computation
of e contains reads from the store that are no longer
consistent within the context of the new computation.

Dually, the IL expression memo e, which we call a memo 
point, has the same meaning as e, except that during reevaluation,
a previous computation of e can be reused in place
of the present one, provided that they match. Two computations
of the same expression e match if they begin in locally-
equivalent states (same local state, but possibly different 
non-local state). This notion of memoization is similar to 
function caching [32] in that it reuses past computation to
avoid reevaluation, but it is also significantly different in that 
impure code is supported, and non-local state need not match 
(a matching computation may contain inconsistent reads). 
We correct inconsistencies by reevaluating each inconsistent 
read within the reused computation. 
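
For contrast, the sketch below shows plain function caching on the expression trees of Section 2.1, which is the simpler technique that memo points are compared with above; it is not an implementation of memo points. The cached and result fields, the evals counter, and the function reuse_demo are hypothetical. In particular, the manual reset of a.cached stands in for what change propagation would do automatically when the read of a's right child becomes inconsistent.

```c
typedef struct node_s *node_t;
struct node_s {
  enum { LEAF, BINOP } tag;
  union {
    int leaf_val;
    struct { enum { PLUS, MINUS } op; node_t left, right; } binop;
  } u;
  int cached;   /* hypothetical: nonzero if result is valid */
  int result;
};

static int evals;   /* invocations that were not served from the cache */

int eval_cached(node_t n) {
  if (n->cached) return n->result;   /* reuse a past computation */
  evals++;
  int v;
  if (n->tag == LEAF) {
    v = n->u.leaf_val;
  } else {
    int l = eval_cached(n->u.binop.left);
    int r = eval_cached(n->u.binop.right);
    v = (n->u.binop.op == PLUS) ? l + r : l - r;
  }
  n->cached = 1;
  n->result = v;
  return v;
}

static node_t leaf(node_t n, int v) {
  n->tag = LEAF; n->u.leaf_val = v; n->cached = 0;
  return n;
}

static node_t binop(node_t n, int op, node_t l, node_t r) {
  n->tag = BINOP; n->cached = 0;
  n->u.binop.op = op; n->u.binop.left = l; n->u.binop.right = r;
  return n;
}

/* Evaluates ((3 + 4) - 0) + (5 - 6), applies the change from Section 2.1,
   and reevaluates. Reports how many nodes each run actually evaluated. */
int reuse_demo(int *first_evals, int *second_evals) {
  struct node_s a, b, c, d, e, f, g, h, i, j, k;
  binop(&c, PLUS, leaf(&d, 3), leaf(&e, 4));
  binop(&b, MINUS, &c, leaf(&f, 0));
  binop(&g, MINUS, leaf(&h, 5), leaf(&i, 6));
  binop(&a, PLUS, &b, &g);
  evals = 0;
  eval_cached(&a);
  *first_evals = evals;

  binop(&j, PLUS, &g, leaf(&k, 5));
  a.u.binop.right = &j;
  a.cached = 0;            /* manual invalidation of the affected node */
  evals = 0;
  int out = eval_cached(&a);
  *second_evals = evals;
  return out;
}
```

Without the manual invalidation, the cache would return the stale result for a. This is the kind of inconsistency that, in the approach described above, is corrected by reevaluating inconsistent reads within a reused computation.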

We can insert update and memo points freely within 
an existing IL program without changing its meaning (up 
to reevaluation and reuse behavior). Since they allow more 
fine-grained reevaluation and reuse, one might want to in- 
sert them before and after every instruction in the program. 
Unfortunately, each such insertion incurs some tracing over- 
head, as memo and update points each necessitate saving a 
snapshot of local state. 

Fortunately, we can automatically insert a smaller yet 
equally effective set of update points by focusing only on 
reads. Figures 5 and 8(a) show examples of this: since each
read appears within the body of an update point, we can 
reevaluate these reads, including the code that depends on 
them, should they become inconsistent with memory. We say 
that each such read is guarded by an update point. 

For memo points, however, it is less clear how to automat- 
ically strike the right balance between too many (too much 
overhead) and not enough (not enough reuse). Instead, we 
expose surface syntax to the C programmer, who can insert
them as statements (memo;) as well as expressions (e.g.,
memo(f(x))). In Section 2.4, we discuss where to place
memo points within our running examples.
2.4 Change Propagation Strategies Revisited



In Sections 2.1 and 2.2, we sketched strategies for updating
computations using change propagation. Based on the IL
representations described in Section 2.3, we informally describe
our semantics for change propagation in greater detail.
The remainder of the paper makes this semantics precise and 
describes our current implementation. 

Computations as traces. We represent computations using 
an execution trace, which records the memo and update 
points, store operations (allocs, reads and writes), and stack 
operations (push and pop). 

To a first approximation, change propagation of these 
traces has two aspects: reevaluating inconsistent subtraces, 
and reusing consistent ones. Operationally, these aspects 
mean that we need to decide not only which computations 
in the trace to reevaluate, but also where this reevaluation 
should cease. 

Beginning a reevaluation. In order to repair inconsisten- 
cies in the trace, we begin reevaluations at update points 
that guard inconsistent reads. We identify reads as incon- 
sistent when the memory location they depend on is affected 
by writes being inserted into or removed from the trace. That 
is, a read is identified as affected in one of two ways: when 
inserting a newly traced write (of a different value) that be- 
comes the newly read value, or when removing a previously 
traced write that had been the previously read value. In ei- 
ther case, the read in question becomes inconsistent and can- 
not be reused in the trace without first being reevaluated. To 
begin such a reevaluation, we restore the local state from the
trace and reevaluate within the context of the current mem- 
ory and control stack, which generally both differ from those 
of the original computation. 
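
As a rough illustration of what it means for a traced read to be inconsistent, the sketch below records, for each read, the location and the value observed, and later compares the record against current memory. This is a deliberate simplification and an assumption on our part: the approach described above identifies affected reads by tracking writes inserted into or removed from the trace, not by comparing values, and the names read_rec_t and first_inconsistent_read are hypothetical.

```c
#include <stddef.h>

/* Hypothetical trace entry for one traced read. */
typedef struct {
  const int *loc;   /* memory location that was read */
  int seen;         /* value observed when the read was traced */
} read_rec_t;

/* Returns the index of the earliest read in the trace whose recorded value
   no longer agrees with memory, or -1 if every read is still consistent.
   Reevaluation would begin at the update point guarding that read. */
int first_inconsistent_read(const read_rec_t *trace, size_t n) {
  for (size_t i = 0; i < n; i++) {
    if (*trace[i].loc != trace[i].seen)
      return (int)i;
  }
  return -1;
}
```

A caller would populate the trace during the initial run, mutate the store, and then ask for the first inconsistent read to decide where reevaluation starts.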

Ending a reevaluation. We end a reevaluation in one of 
two ways. First, recall that we begin reevaluation with a dif- 
ferent control stack than that used by the original compu- 
tation. Hence, we will eventually encounter a pop that we 
cannot correctly reevaluate, as doing so requires knowing 
the contents of the original computation's stack. Instead, we 
cease reevaluation at such pops. We justify this behavior be- 
low and describe how it still leads to a sound approach. 

Second, as described in Section 2.3, when we encounter
a memo point, we may find a matching computation to 
reuse. If so, we cease the current reevaluation and begin 
reevaluations that repair inconsistencies within the reused 
computation, if any. 

Example 1 revisited. The strategy from Section 2.1 requires
that the previous computation be reevaluated in some
places, and reused in others. First, as Figure 5 shows, we
note that however an input tree is modified, update points
guard the computation's affected reads. We reevaluate these
update points. For instance, in the given change (of the right
subtree of a), line 9 has the first affected read, which is
guarded by an update point on line 8; this point corresponds
to a2, which we reevaluate first. Second, our strategy reuses
computation g1-g3. To this end, we can insert a memo statement
at the beginning of function eval in Figure 2 (not
shown), resulting in the memo point shown on line 1 in Figure 5.
Since it precedes each invocation, this memo point
allows for the desired reuse of unaffected subcomputations.

Example 2 revisited. Recall that our strategy for Section 2.2
consists of reevaluating iterations of the inner
for loop that are affected, and reusing those that are not.
To begin each reevaluation within this loop (Figure 8(a)),
we reevaluate their update points.

Now we consider where to cease reevaluation. Note that 
the update point in after_max guards a read, as well as the
recursive use of for_loop, which evaluates the remaining
(possibly unaffected) iterations of the loop. However, recall 
that we do not want reevaluation to continue with the re- 
maining iterations — we want to reuse them. 

We describe two ways to cease reevaluation and enable
reuse. First, we can insert a memo statement at the end of
the inner for loop in Figure 6, resulting in the memo point
shown in Figure 8(b). Second, we can wrap the for loop's
body with a cut block, written cut{...}, resulting in the
additional push-pop pair in Figure 8(c). Cut blocks are optional
but convenient syntactic sugar: their use is equivalent
to moving a code block into a separate function (hence
the push-pop pair in Figure 8(c)). Regardless of which we
choose, the new memo and pop both allow us to cease
reevaluation immediately after an iteration is reevaluated
within Figures 8(b) and 8(c), respectively.



Call/return dependencies. Recall from Section 2.1 that we
must be mindful of call/return dependencies among the recursive
invocations. In particular, after reevaluating a subcomputation
whose return value changes, the consumer of
this return value (another subcomputation) is affected and 
should be reevaluated (as in the example). 

Our general approach for call/return dependencies has
three parts. First, when proving consistency (Section ?), we
restrict our attention to programs whose subcomputations'
return values do not change, a crucial property of programs
that we make precise in Section 4. Second, in Section 5, we
provide an automatic transformation of arbitrary programs
into ones that have this property. Third, in Section 7.3, we
introduce one simple way to refine this transformation to reduce
the overhead that it adds to the transformed programs.
With more aggressive analysis, we expect that further effi- 
ciency improvements are possible. 

Contrasted with proving consistency for a semantics
where a fixed approach for call/return dependencies is
"baked in", our consistency proof is more general. It stipulates
a property that can be guaranteed by either of the
two transformations that we describe (Sections 5 and 7.3).
Furthermore, it leaves the possibility open for future work to
improve the currently proposed transformations, e.g., by employing
more sophisticated static analysis to further reduce
the overhead that they introduce.

e   ::= e_u                       Untraced expression
      | e_t                       Traced expression
e_u ::= let fun f(x).e1 in e2     Function definition
      | let x = ⊕(v) in e         Primitive operation
      | if x then e1 else e2      Conditional
      | f(x)                      Function application
e_t ::= let x = ι in e            Store instruction
      | memo e                    Memo point
      | update e                  Update point
      | push f do e               Stack push
      | pop x                     Stack pop
ι   ::= alloc(x)                  Allocate an array of size x
      | read(x[y])                Read yth entry at x
      | write(x[y],z)             Write z as yth entry at x
v   ::= n | x                     Natural numbers, variables

Figure 9. IL syntax.

2.5 Guide for the Paper 

Section 3 presents the abstract machine semantics for IL,
including our change propagation semantics. Section 4
presents our consistency theorem. Section 5 presents a
destination-passing style transformation whose target
programs meet our side condition for consistency. Section 7
gives compilation and runtime techniques for our semantics.
Section 8 describes our implementation. Section 9 gives an
empirical evaluation. Sections 10 and 11 give related work
and conclude.



3. A Self-Adjusting Intermediate Language

We present IL, a self-adjusting intermediate language, as 
well as two abstract machines that evaluate IL syntax. We 
call these the reference machine and the tracing machine, 
respectively. As its name suggests, we use the first machine 
as a reference when defining and reasoning about the trac- 
ing machine. Each machine is defined by its own transition 
relation over similar machine components. The tracing ma- 
chine mirrors the reference machine, but includes additional 
machine state components and transition rules that work
together to generate and edit execution traces. This tracing
behavior formalizes the notion of IL as a self-adjusting
language.

3.1 Abstract Syntax of IL

Figure 9 shows the abstract syntax for IL. Programs in IL
are expressions, which we partition into traced $e^t$ and
untraced $e^u$. This distinction does not constrain the language;
it merely streamlines the technical presentation. Expressions
in IL follow an administrative normal form (ANF) [20]
where (nearly) all values are variables.

Expressions consist of function definitions, primitive
operations, conditionals, function calls, store instructions ($\iota$),
memo points, update points, and operations for pushing
(push) and popping (pop) the stack. Store instructions ($\iota$)
consist of operations for allocating (alloc), reading (read)
and writing (write) memory. Values $v$ include natural
numbers and variables (but not function names). Each expression
ends syntactically with either a function call or a stack pop
operation. Since the form for function calls is syntactically in
tail position, the IL program must explicitly push the stack to
perform non-tail calls. Expressions terminate when they pop
on an empty stack; they yield the values of this final pop.

Notice that IL programs are first-order: although
functions can nest syntactically, they are not values; moreover,
function names $f, g, h$ are syntactically distinct from
variables $x, y, z$. Supporting either first-class functions
(functions as values) or function pointers is beyond the scope of
the current work, though we believe our semantics could be
adapted for these settings (see the footnote on function pointers).

In the remainder, we restrict our attention to programs 
(environments p and expressions e) that are well-formed in 
the following sense: 

1. They have a unique arity (the length of the value se- 
quence they potentially return) that can be determined 
syntactically. 

2. All variable and function names therein are distinct. (This 
can easily be implemented in a compiler targeting IL.) 
Consequently we don't have to worry about the fact that 
IL is actually dynamically scoped. 
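
As an illustration, the grammar of Figure 9 can be sketched as a set of Python data classes. This is only a reading aid under our own assumptions: the class and field names (`LetFun`, `LetStore`, `ends_in_call_or_pop`, and so on) are hypothetical and do not come from the paper's implementation. The sketch also checks the syntactic property stated above, namely that every expression ends with a function call or a stack pop.

```python
from dataclasses import dataclass
from typing import Tuple

# Store instructions (hypothetical class names).
@dataclass(frozen=True)
class Alloc:                 # alloc(x)
    size: str
@dataclass(frozen=True)
class Read:                  # read(x[y])
    loc: str
    off: str
@dataclass(frozen=True)
class Write:                 # write(x[y], z)
    loc: str
    off: str
    val: str

# Untraced expressions.
@dataclass(frozen=True)
class LetFun:                # let fun f(params).body in rest
    name: str
    params: Tuple[str, ...]
    body: object
    rest: object
@dataclass(frozen=True)
class LetPrim:               # let x = op(args) in rest
    var: str
    op: str
    args: tuple              # each argument is a number or a variable name
    rest: object
@dataclass(frozen=True)
class If:                    # if x then e1 else e2
    cond: str
    then: object
    els: object
@dataclass(frozen=True)
class Call:                  # f(args), always in tail position
    name: str
    args: Tuple[str, ...]

# Traced expressions.
@dataclass(frozen=True)
class LetStore:              # let x = instr in rest
    var: str
    instr: object
    rest: object
@dataclass(frozen=True)
class Memo:                  # memo e
    body: object
@dataclass(frozen=True)
class Update:                # update e
    body: object
@dataclass(frozen=True)
class Push:                  # push f do e
    name: str
    body: object
@dataclass(frozen=True)
class Pop:                   # pop xs
    vars: Tuple[str, ...]

TRACED = (LetStore, Memo, Update, Push, Pop)

def is_traced(e) -> bool:
    """True for traced expressions, False for untraced ones."""
    return isinstance(e, TRACED)

def ends_in_call_or_pop(e) -> bool:
    """Check that every syntactic path through e ends in a call or a pop."""
    if isinstance(e, (Call, Pop)):
        return True
    if isinstance(e, If):
        return ends_in_call_or_pop(e.then) and ends_in_call_or_pop(e.els)
    if isinstance(e, LetFun):
        return ends_in_call_or_pop(e.body) and ends_in_call_or_pop(e.rest)
    if isinstance(e, (LetPrim, LetStore)):
        return ends_in_call_or_pop(e.rest)
    if isinstance(e, (Memo, Update, Push)):
        return ends_in_call_or_pop(e.body)
    return False
```

For instance, `Push("k", Pop(("a",)))` encodes a non-tail call: the body pops its result, which is then passed to the pushed function `k`.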

3.2 Machine Configurations and Transitions 

In addition to sharing a common expression language (viz.
IL, Section 3.1), the reference and tracing machines share
common machine components; they also have related
transition relations, which specify how these machines change
their components as they run IL programs.

Machine configurations. Each machine configuration
consists of a handful of components. Figure 10 defines the
common components of the two machines: a store ($\sigma$), a stack ($\kappa$),
an environment ($\rho$) and a command ($\alpha_r$ for the reference
machine, and $\alpha_t$ for the tracing machine). The tracing machine
has an additional component, its trace, which we describe
in Sections 3.4 and 3.5.

A store $\sigma$ maps each store entry ($\ell[n]$) to either
uninitialized contents (written $\bot$) or a machine value $\nu$. Each
entry $\ell[n]$ consists of a store location $\ell$ and a (natural number)
offset $n$. In addition, a store may mark a location as garbage,
denoted as $\ell \mapsto \diamond$, in which case all store entries for $\ell$ are
undefined. These garbage locations are not used in the
reference semantics; in the tracing machine, they help to
define a notion of garbage collection. A stack $\kappa$ is a (possibly
empty) sequence of frames, where each frame $\lfloor\rho, f\rfloor$ saves
an evaluation context that consists of an environment $\rho$ and a
function $f$ (defined in $\rho$). An environment $\rho$ maps variables
to machine values and function names to their definitions.

Footnote: For example, to model function pointers, one could adapt this
semantics to allow a function $f$ to be treated as a value if $f$ is closed
by its arguments; this restriction models the way that functions in C admit
function pointers, a kind of "function as a value", even though C does not
include features typically associated with first-class functions (e.g.,
implicitly-created closures, partial application).

In the case of the reference machine, a (reference)
command $\alpha_r$ is either an IL expression $e$ or a sequence of
machine values $\bar{\nu}$; for the tracing machine, a (tracing)
command $\alpha_t$ is either $e$, $\bar{\nu}$, or an additional command prop,
which indicates that the machine is performing change
propagation (i.e., replay of an existing trace).

Each machine value $\nu$ consists of a natural number $n$ or
a store location $\ell$. Intuitively, we think of machine values as
corresponding to machine words, and we think of the store
as mapping location-offset pairs (each of which is itself a
machine word) to other machine words.

For convenience, when we do not care about individual 
components of a machine configuration (or some other syn- 
tactic object), we often use underscores (_) to avoid giving 
them names. The quantification should always be clear from 
context. 

Transition relations. In the reference machine, each
machine configuration, written $\sigma, \kappa, \rho, \alpha_r$, consists of four
components: a store, a stack, an environment and a command, as
described above. In Section 3.3, we formalize the following
stepping relation for the reference machine:

\[
\sigma, \kappa, \rho, \alpha_r \xrightarrow{r} \sigma', \kappa', \rho', \alpha_r'
\]

Intuitively, the command tells the reference machine what
to do next. In the case of an expression $e$, the machine
proceeds by evaluating $e$, and in the case of machine values $\bar{\nu}$,
the machine proceeds by popping a stack frame $\lfloor\rho, f\rfloor$ and
using it as the new evaluation context. If the stack is empty,
the machine terminates and the command $\bar{\nu}$ can be viewed
as giving the machine's results. Since these results may
consist of store locations, the complete extensional result of the
machine must include the store (or at least, the portion
reachable from $\bar{\nu}$).
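
To make "the portion reachable from the results" concrete, the following minimal sketch computes it by graph search. The encoding is our own assumption and not the paper's: locations are modeled as `("loc", k)` tuples, numbers as plain integers, and the store as a dictionary from (location, offset) pairs to machine values.

```python
from collections import deque

def is_loc(v):
    """Locations are modeled as ("loc", k) tuples; numbers are plain ints."""
    return isinstance(v, tuple) and len(v) == 2 and v[0] == "loc"

def reachable_store(results, store):
    """Return the part of `store` reachable from the result values.

    `store` maps (location, offset) pairs to machine values.
    """
    seen = set()
    work = deque(v for v in results if is_loc(v))
    while work:
        loc = work.popleft()
        if loc in seen:
            continue
        seen.add(loc)
        # Follow every location stored in an entry of `loc`.
        for (entry_loc, _offset), v in store.items():
            if entry_loc == loc and is_loc(v) and v not in seen:
                work.append(v)
    return {key: v for key, v in store.items() if key[0] in seen}
```

If the results contain no locations, the reachable portion is empty and the value sequence alone describes the outcome.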

The tracing machine has similar machine configurations,
though it also includes a pair $\langle\Pi, T\rangle$ that represents the
current trace, which may be in the midst of adjustment; we
describe this component separately in Sections 3.4 and 3.5.
In Section 3.6, we formalize the following stepping relation
for the tracing machine:

\[
\langle\Pi, T\rangle, \sigma, \kappa, \rho, \alpha_t \xrightarrow{t} \langle\Pi', T'\rangle, \sigma', \kappa', \rho', \alpha_t'
\]

At a high level, this transition relation accomplishes several
things: (1) it "mirrors" the semantics of the reference
machine when evaluating IL expressions; (2) it traces this
evaluation, storing the generated trace within its trace component;


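\[
\begin{array}{llcl}
\text{Store} & \sigma & ::= & \varepsilon \mid \sigma[\ell[n] \mapsto \bot] \mid \sigma[\ell[n] \mapsto \nu] \mid \sigma[\ell \mapsto \diamond] \\
\text{Stack} & \kappa & ::= & \varepsilon \mid \kappa \cdot \lfloor\rho, f\rfloor \\
\text{Environment} & \rho & ::= & \varepsilon \mid \rho[x \mapsto \nu] \mid \rho[f \mapsto \mathbf{fun}\ f(\bar{x}).e] \\
\text{Reference command} & \alpha_r & ::= & e \mid \bar{\nu} \\
\text{Tracing command} & \alpha_t & ::= & \alpha_r \mid \mathbf{prop} \\
\text{Machine value} & \nu & ::= & n \mid \ell
\end{array}
\]

Figure 10. Common machine components.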



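\[
\begin{array}{ll}
\text{R.1} & \dfrac{\rho' = \rho[f \mapsto \mathbf{fun}\ f(\bar{x}).e_1]}{\sigma, \kappa, \rho, \mathbf{let\ fun}\ f(\bar{x}).e_1\ \mathbf{in}\ e_2 \xrightarrow{r} \sigma, \kappa, \rho', e_2} \\[3ex]
\text{R.2} & \dfrac{\forall i.\ \nu_i = \rho(v_i) \qquad \rho' = \rho[x \mapsto \mathrm{primapp}(\oplus, \bar{\nu})]}{\sigma, \kappa, \rho, \mathbf{let}\ x = \oplus(\bar{v})\ \mathbf{in}\ e \xrightarrow{r} \sigma, \kappa, \rho', e} \\[3ex]
\text{R.3} & \dfrac{\rho(x) \neq 0}{\sigma, \kappa, \rho, \mathbf{if}\ x\ \mathbf{then}\ e_1\ \mathbf{else}\ e_2 \xrightarrow{r} \sigma, \kappa, \rho, e_1} \\[3ex]
\text{R.4} & \dfrac{\rho(x) = 0}{\sigma, \kappa, \rho, \mathbf{if}\ x\ \mathbf{then}\ e_1\ \mathbf{else}\ e_2 \xrightarrow{r} \sigma, \kappa, \rho, e_2} \\[3ex]
\text{R.5} & \dfrac{\rho(f) = \mathbf{fun}\ f(\bar{x}).e \qquad \rho' = \rho[x_i \mapsto \rho(y_i)]_{i=1}^{|\bar{x}|}}{\sigma, \kappa, \rho, f(\bar{y}) \xrightarrow{r} \sigma, \kappa, \rho', e} \\[3ex]
\text{R.6} & \dfrac{\sigma, \rho, \iota \xrightarrow{s} \sigma', \nu}{\sigma, \kappa, \rho, \mathbf{let}\ x = \iota\ \mathbf{in}\ e \xrightarrow{r} \sigma', \kappa, \rho[x \mapsto \nu], e} \\[3ex]
\text{R.7} & \sigma, \kappa, \rho, \mathbf{memo}\ e \xrightarrow{r} \sigma, \kappa, \rho, e \\[2ex]
\text{R.8} & \sigma, \kappa, \rho, \mathbf{update}\ e \xrightarrow{r} \sigma, \kappa, \rho, e \\[2ex]
\text{R.9} & \sigma, \kappa, \rho, \mathbf{push}\ f\ \mathbf{do}\ e \xrightarrow{r} \sigma, \kappa \cdot \lfloor\rho, f\rfloor, \rho, e \\[2ex]
\text{R.10} & \dfrac{\bar{\nu} = \rho(\bar{x})}{\sigma, \kappa, \rho, \mathbf{pop}\ \bar{x} \xrightarrow{r} \sigma, \kappa, \varepsilon, \bar{\nu}} \\[3ex]
\text{R.11} & \dfrac{\rho(f) = \mathbf{fun}\ f(\bar{x}).e \qquad \rho' = \rho[x_i \mapsto \nu_i]_{i=1}^{|\bar{x}|}}{\sigma, \kappa \cdot \lfloor\rho, f\rfloor, \varepsilon, \bar{\nu} \xrightarrow{r} \sigma, \kappa, \rho', e}
\end{array}
\]

Figure 11. Stepping relation for reference machine ($\xrightarrow{r}$).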



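and (3) it allows previously-generated traces to be either
reused (during change propagation), or discarded (when they
cannot be reused). To accomplish these goals, the tracing
machine distinguishes machine transitions for change
propagation from those of normal execution by giving change
propagation the distinguished command prop.

\[
\begin{array}{ll}
\text{S.1} & \dfrac{\ell \notin \mathrm{dom}(\sigma) \qquad \sigma' = \sigma[\ell[i] \mapsto \bot]_{i=1}^{\rho(x)}}{\sigma, \rho, \mathbf{alloc}(x) \xrightarrow{s} \sigma', \ell} \\[3ex]
\text{S.2} & \dfrac{\sigma(\rho(x)[\rho(y)]) = \nu}{\sigma, \rho, \mathbf{read}(x[y]) \xrightarrow{s} \sigma, \nu} \\[3ex]
\text{S.3} & \dfrac{\sigma' = \sigma[\rho(x)[\rho(y)] \mapsto \rho(z)]}{\sigma, \rho, \mathbf{write}(x[y], z) \xrightarrow{s} \sigma', 0}
\end{array}
\]

Figure 12. Stepping relation for store instructions ($\xrightarrow{s}$).

\[
\begin{array}{llcl}
\text{Trace} & T & ::= & t \cdot T \mid \varepsilon \\
\text{Trace action} & t & ::= & A_{\ell,n} \mid R^{\nu}_{\ell[n]} \mid W^{\nu}_{\ell[n]} \mid M_{\rho,e} \mid U_{\rho,e} \mid (T) \mid \bar{\nu} \\
\text{Trace context} & \Pi & ::= & \varepsilon \mid \Pi \cdot t \mid \Pi \cdot \Box \mid \Pi \cdot \boxminus_T \mid \Pi \cdot \boxplus_T
\end{array}
\]

Figure 13. Traces, trace actions and trace contexts.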



3.3 Reference Machine Transitions 



Figure 11 specifies the transition relation for the reference
machine, as introduced in Section 3.2. A function definition
updates the environment, binding the function name to its
definition. A primitive operation first converts each value
argument $v_i$ into a machine value $\nu_i$ using the environment.
Here we abuse notation and write $\rho(v)$ to mean $\rho(x)$ when
$v = x$ and $n$ when $v = n$. The machine binds the
result of the primitive operation (as defined by the abstract
primapp function) to the given variable in the current
environment. A conditional steps to the branch specified by
the scrutinee. A function application steps to the body of the
specified function after updating the environment with the
given arguments. A store instruction $\iota$ steps using an
auxiliary judgement (Figure 12) that allocates in, reads from and
writes to the current store. An alloc instruction allocates a
fresh location $\ell$ for which each offset (from 1 to the
specified size) is marked as uninitialized. A read (resp. write)
instruction reads (resp. writes) the store at a particular
location and offset. A push expression saves a return context
in the form of a stack frame $\lfloor\rho, f\rfloor$ and steps to the body
of the push. A pop expression steps to a machine value
sequence $\bar{\nu}$, as specified by a sequence of variables. If the stack
is non-empty, the machine passes control to function $f$, as
specified by the topmost stack frame $\lfloor\rho, f\rfloor$, by applying $f$
to $\bar{\nu}$; it recovers the environment $\rho$ before discarding this
frame. Otherwise, if the stack is empty, the value sequence $\bar{\nu}$
signals the termination of the machine with results $\bar{\nu}$.
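
The reference transitions can be sketched as a small interpreter loop. The following is a minimal illustration and not the paper's implementation: the tuple encoding of expressions (`("prim", x, op, args, rest)`, `("push", f, body)`, and so on), the primitive names `add` and `mul`, and the function `run` are all assumptions made for this sketch. Expressions are tuples, value sequences are lists, and each branch is annotated with the rule of Figures 11 and 12 that it is meant to mirror.

```python
import operator

# Illustrative primitives; these names are assumptions, not part of IL.
PRIMS = {"add": operator.add, "mul": operator.mul}

def val(rho, v):
    """rho(v): a variable is looked up, a literal number stands for itself."""
    return rho[v] if isinstance(v, str) else v

def run(expr):
    """Run `expr` from an empty store, stack and environment."""
    sigma, kappa, rho, cmd = {}, [], {}, expr
    fresh = 0
    while True:
        if isinstance(cmd, list):               # command is a value sequence
            if not kappa:                       # empty stack: terminate
                return cmd, sigma
            saved, f = kappa.pop()              # R.11: pass values to f
            params, body = saved[f]
            rho, cmd = {**saved, **dict(zip(params, cmd))}, body
            continue
        tag = cmd[0]
        if tag == "fun":                        # R.1
            _, f, params, body, rest = cmd
            rho, cmd = {**rho, f: (params, body)}, rest
        elif tag == "prim":                     # R.2
            _, x, op, args, rest = cmd
            rho, cmd = {**rho, x: PRIMS[op](*(val(rho, a) for a in args))}, rest
        elif tag == "if":                       # R.3 and R.4
            _, x, e1, e2 = cmd
            cmd = e1 if rho[x] != 0 else e2
        elif tag == "call":                     # R.5
            _, f, args = cmd
            params, body = rho[f]
            rho, cmd = {**rho, **{p: rho[a] for p, a in zip(params, args)}}, body
        elif tag == "alloc":                    # R.6 with S.1
            _, x, n, rest = cmd
            loc, fresh = ("loc", fresh), fresh + 1
            sigma.update({(loc, i): None for i in range(1, rho[n] + 1)})
            rho, cmd = {**rho, x: loc}, rest
        elif tag == "read":                     # R.6 with S.2
            _, x, y, z, rest = cmd
            v = sigma[(rho[y], rho[z])]
            if v is None:                       # None models uninitialized
                raise ValueError("read of an uninitialized entry")
            rho, cmd = {**rho, x: v}, rest
        elif tag == "write":                    # R.6 with S.3
            _, x, y, z, rest = cmd
            sigma[(rho[x], rho[y])] = rho[z]
            cmd = rest
        elif tag in ("memo", "update"):         # R.7 and R.8: no effect here
            cmd = cmd[1]
        elif tag == "push":                     # R.9: save return context
            _, f, body = cmd
            kappa.append((rho, f))
            cmd = body
        elif tag == "pop":                      # R.10
            rho, cmd = {}, [rho[x] for x in cmd[1]]
        else:
            raise ValueError(f"unknown expression form: {tag!r}")
```

In this encoding a non-tail call is written `("push", "k", body)`: the values popped by `body` become the arguments of `k`, which runs in the environment saved at the push.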

3.4 The Structure of the Trace 

The structure of traces used by the tracing machine is
specified by Figure 13. They each consist of a (possibly empty)
sequence of zero or more trace actions $t$. Each action records
a transition for a corresponding traced expression $e^t$.

In the case of store instructions, the corresponding
action indicates both the instruction and each machine value
involved in its evaluation. For allocs, the action $A_{\ell,n}$ records
the allocated location as well as its size (i.e., the range of
offsets it defines). For reads ($R^{\nu}_{\ell[n]}$) and writes ($W^{\nu}_{\ell[n]}$) the
action stores the location and offset being accessed, as well
as the machine value being read or written, respectively. For
memo expressions, the trace action $M_{\rho,e}$ records the body
of the memo point, as well as the current environment at
this point; update expressions are traced analogously. For
push expressions, the action $(T)$ records the trace of
evaluating the push body; it is significant that in this case, the
trace action is not atomic: it consists of the arbitrarily large
subtrace $T$. For pop expressions, the action $\bar{\nu}$ records the
machine values being returned via the stack.

There is a close relationship between the syntax of traced
expressions in IL and the structure of their traces. For
instance, in nearly all traced expressions, there is exactly one
subexpression, and hence their traces $t \cdot T$ contain exactly
one subtrace, $T$. The exception to this is push, which can be
thought of as specifying two subexpressions: the first
subexpression is given by the body of the push, and recorded
within the push action as $(T)$; the second subexpression is
the body of the function being pushed, which is evaluated
when the function is later popped. Hence, push expressions
generate traces of the form $(T) \cdot T'$, where $T'$ is the trace
generated by evaluating the pushed/popped function.

3.5 Trace Contexts and the Trace Zipper

As described above, our traces are not strictly sequential
structures: they also consist of nested subtraces created by
push. This fact poses a technical challenge for transition
semantics (and by extension, an implementation). For instance,
while generating such a subtrace, how should we maintain
the context of the trace that will eventually enclose it?
To address this, the machine augments the trace with a
context (Figure 13), maintaining in each configuration both
a reuse trace $T$, which we say is in focus, as well as an
unfocused trace context $\Pi$. The trace context effectively
records a path from the focus back to the start of the trace. To
move the focus in a consistent manner, the machine places
additional markings $\Box$, $\boxminus$, $\boxplus$ into the context; two of these
markings (viz. $\boxminus$, $\boxplus$) also carry a subtrace. We describe these
markings and their subtraces in more detail below.

This pair of components $\langle\Pi, T\rangle$ forms a kind of trace
zipper. More generally, a zipper augments a data structure with
a focus (for zipper $\langle\Pi, T\rangle$, we say that $T$ is in focus), the
ability to perform local edits at the focus and the ability to
move this focus throughout the structure [2, 26]. A
particularly attractive feature of zippers is that the "edits" can be
performed in a non-destructive, incremental fashion.

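Figure 14. Tracing transition modes, across push actions. For each mode,
the figure shows the following sequence of focused traces, labeled with
the corresponding transition rules:

\[
\begin{array}{ll}
\text{Evaluation} & \langle\Pi, T\rangle \xrightarrow{\text{E.6}} \langle\Pi \cdot \Box, T\rangle \xrightarrow{\text{E.0-8}} \langle\Pi', T\rangle \xrightarrow{\text{E.8}} \langle\Pi \cdot (T'), T\rangle \\[1ex]
\text{Undoing} & \langle\Pi, (T_1) \cdot T_2\rangle \xrightarrow{\text{U.3}} \langle\Pi \cdot \boxminus_{T_2}, T_1\rangle \xrightarrow{\text{U.1-4}} \langle\Pi \cdot \boxminus_{T_2}, \varepsilon\rangle \xrightarrow{\text{U.4}} \langle\Pi, T_2\rangle \\[1ex]
\text{Propagation} & \langle\Pi, (T_1) \cdot T_2\rangle \xrightarrow{\text{P.6}} \langle\Pi \cdot \boxplus_{T_2}, T_1\rangle \xrightarrow{\text{P.1-8}} \langle\Pi', \varepsilon\rangle \xrightarrow{\text{P.8}} \langle\Pi \cdot (T_1), T_2\rangle
\end{array}
\]

Figure 15. Tracing machine: commands and transitions.

To characterize focus movement using trace zippers, we
define the transition modes of the tracing machine: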

• Evaluation mirrors the transitions of the reference
machine and generates new trace actions, placing them
behind the focus, i.e., $\langle\Pi, T\rangle$ becomes $\langle\Pi \cdot t, T\rangle$.

• Undoing removes actions from the reuse trace, just ahead
of the focus, i.e., $\langle\Pi, t \cdot T\rangle$ becomes $\langle\Pi, T\rangle$.

• Propagation replays the actions of the reuse trace; it
moves the focus through it action by action, i.e., $\langle\Pi, t \cdot T\rangle$
becomes $\langle\Pi \cdot t, T\rangle$.
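
Ignoring push actions for the moment, the three modes can be sketched as pure functions on a zipper. This is an illustrative sketch only: the representation of the context and the reuse trace as Python tuples, and the function names `evaluate`, `undo` and `propagate`, are assumptions of this example. Because tuples are immutable, each step returns a new zipper and leaves the old one intact, which reflects the non-destructive character of zipper edits.

```python
def evaluate(ctx, reuse, action):
    """Evaluation: <Pi, T> becomes <Pi.t, T> for a newly generated action t."""
    return ctx + (action,), reuse

def undo(ctx, reuse):
    """Undoing: <Pi, t.T> becomes <Pi, T>; the action t is discarded."""
    if not reuse:
        raise ValueError("nothing to undo: the reuse trace is empty")
    return ctx, reuse[1:]

def propagate(ctx, reuse):
    """Propagation: <Pi, t.T> becomes <Pi.t, T>; the action t is kept."""
    if not reuse:
        raise ValueError("nothing to propagate: the reuse trace is empty")
    return ctx + (reuse[0],), reuse[1:]
```

Starting from an empty context and a reuse trace of three actions, one propagation step followed by one undo step and one evaluation step keeps the first action, drops the second, and records a fresh action before the third.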

If we ignore push actions and their nested subtraces $(T)$,
the tracing machine moves the focus in the manner just
described, either generating, undoing or propagating at most
one trace action for each machine transition. However, since
push actions consist of an entire subtrace $T$, the machine
cannot generate, undo or propagate them in a single step.
Rather, the machine must make a series of transitions,
possibly interleaving transition modes. When this process
completes and the machine moves its focus out of the subtrace, it
is crucial that it does so in a manner consistent with its mode
upon entering the subtrace. To this end, the machine may
extend the context $\Pi$ with one of three possible markings, each
corresponding to a mode.

For each transition mode, Figure 14 gives both syntactic
and pictorial representations of the focused traces and
illustrates how the machine moves its focus. The transitions
are labeled with corresponding (blue) transition rules from
the tracing machine, but at this time the reader can ignore
them. For each configuration, the (initial) trace context is
illustrated with a vertical line, the focus is represented by a
(red) filled circle and the (initial) reuse trace is represented
by a tree-shaped structure that hangs below the focus.

Evaluation. To generate a new subtrace in evaluation
mode (via a push), the machine extends the context $\Pi$ to
$\Pi \cdot \Box$; this effectively marks the beginning of the new
subtrace. The machine then performs evaluation transitions that
extend the context, perhaps recursively generating nested
subtraces in the process (drawn as smaller, unlabeled
triangles hanging to the left). After evaluating the pop matching
the initial push, the machine rewinds the current context $\Pi'$,
moving the focus back to the mark $\Box$, gathering actions and
building a completed subtrace $T'$; it replaces the mark with
a push action $(T')$ (consisting of the completed subtrace),
and it keeps reuse trace $T$ in focus. We specify how this
rewinding works in Section 3.6; intuitively, it simply moves
the focus backwards, towards the start of the trace.

Undoing. To undo a subtrace $T_1$ of the reuse trace $(T_1) \cdot T_2$,
the machine extends the context $\Pi$ to $\Pi \cdot \boxminus_{T_2}$; this effectively
saves the remaining reuse trace $T_2$ for either further undo
transitions or for eventual reuse. Assuming that the machine
undoes all of $T_1$, it will eventually focus on an empty trace $\varepsilon$.
In this case, the machine can move the saved subtrace $T_2$ into
focus (again, for either further undo transitions or for reuse).

Propagation. Finally, to propagate a subtrace $T_1$, the
machine uses an approach similar to undoing: it saves the
remaining trace $T_2$ in the context using a distinguished
mark $\boxplus_{T_2}$, moves the focus to the end of $T_1$ and eventually
places $T_2$ into focus. In contrast to the undo transitions,
however, propagation transitions do not discard the reuse trace,
but only move the focus by moving trace actions from the
reuse trace into the trace context. Just as in evaluation mode,
in propagation mode we rewind these actions from the
context and move the focus back to the propagation mark ($\boxplus$).

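We note that while our semantics characterizes change
propagation using a step-by-step replay of the trace, this
does not yield an efficient algorithm. In Section 7.1, we
give an efficient implementation that is faithful to this replay
semantics, but in which the change propagation transitions
have zero cost.

\[
\begin{array}{lll}
\multicolumn{3}{l}{\textbf{Evaluation}} \\
\text{E.0} & \langle\Pi, T\rangle, \sigma, \kappa, \rho, e^u \xrightarrow{t} \langle\Pi, T\rangle, \sigma, \kappa, \rho', e & \text{when } \sigma, \kappa, \rho, e^u \xrightarrow{r} \sigma, \kappa, \rho', e \\
\text{E.1} & \langle\Pi, T\rangle, \sigma, \kappa, \rho, \mathbf{let}\ x = \mathbf{alloc}(y)\ \mathbf{in}\ e \xrightarrow{t} \langle\Pi \cdot A_{\ell,\rho(y)}, T\rangle, \sigma', \kappa, \rho[x \mapsto \ell], e & \text{when } \sigma, \rho, \mathbf{alloc}(y) \xrightarrow{s} \sigma', \ell \\
\text{E.2} & \langle\Pi, T\rangle, \sigma, \kappa, \rho, \mathbf{let}\ x = \mathbf{read}(y[z])\ \mathbf{in}\ e \xrightarrow{t} \langle\Pi \cdot R^{\nu}_{\rho(y)[\rho(z)]}, T\rangle, \sigma, \kappa, \rho[x \mapsto \nu], e & \text{when } \sigma, \rho, \mathbf{read}(y[z]) \xrightarrow{s} \sigma, \nu \\
\text{E.3} & \langle\Pi, T\rangle, \sigma, \kappa, \rho, \mathbf{let}\ \_ = \mathbf{write}(x[y], z)\ \mathbf{in}\ e \xrightarrow{t} \langle\Pi \cdot W^{\rho(z)}_{\rho(x)[\rho(y)]}, T\rangle, \sigma', \kappa, \rho, e & \text{when } \sigma, \rho, \mathbf{write}(x[y], z) \xrightarrow{s} \sigma', 0 \\
\text{E.4} & \langle\Pi, T\rangle, \sigma, \kappa, \rho, \mathbf{memo}\ e \xrightarrow{t} \langle\Pi \cdot M_{\rho,e}, T\rangle, \sigma, \kappa, \rho, e & \\
\text{E.5} & \langle\Pi, T\rangle, \sigma, \kappa, \rho, \mathbf{update}\ e \xrightarrow{t} \langle\Pi \cdot U_{\rho,e}, T\rangle, \sigma, \kappa, \rho, e & \\
\text{E.6} & \langle\Pi, T\rangle, \sigma, \kappa, \rho, \mathbf{push}\ f\ \mathbf{do}\ e \xrightarrow{t} \langle\Pi \cdot \Box, T\rangle, \sigma, \kappa \cdot \lfloor\rho, f\rfloor, \rho, e & \\
\text{E.7} & \langle\Pi, T\rangle, \sigma, \kappa, \rho, \mathbf{pop}\ \bar{x} \xrightarrow{t} \langle\Pi \cdot \bar{\nu}, T\rangle, \sigma, \kappa, \varepsilon, \bar{\nu} & \text{when } \bar{\nu} = \rho(\bar{x}) \\
\text{E.8} & \langle\Pi, T_2\rangle, \sigma, \kappa \cdot \lfloor\rho, f\rfloor, \varepsilon, \bar{\nu} \xrightarrow{t} \langle\Pi' \cdot (T_1), T_2'\rangle, \sigma, \kappa, \rho', e & \text{when } \langle\Pi, T_2\rangle; \varepsilon \circlearrowleft^{*} \langle\Pi' \cdot \Box, T_2'\rangle; T_1 \\
 & & \text{and } \rho(f) = \mathbf{fun}\ f(\bar{x}).e \\
 & & \text{and } \rho' = \rho[x_i \mapsto \nu_i]_{i=1}^{|\bar{x}|} \\[1ex]
\multicolumn{3}{l}{\textbf{Reevaluation and reuse}} \\
\text{P.E} & \langle\Pi, U_{\rho,e} \cdot T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} \xrightarrow{t} \langle\Pi \cdot U_{\rho,e}, T\rangle, \sigma, \kappa, \rho, e & \\
\text{E.P} & \langle\Pi, M_{\rho,e} \cdot T\rangle, \sigma, \kappa, \rho, \mathbf{memo}\ e \xrightarrow{t} \langle\Pi \cdot M_{\rho,e}, T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} & \\[1ex]
\multicolumn{3}{l}{\textbf{Propagation}} \\
\text{P.1} & \langle\Pi, A_{\ell,n} \cdot T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} \xrightarrow{t} \langle\Pi \cdot A_{\ell,n}, T\rangle, \sigma', \kappa, \varepsilon, \mathbf{prop} & \text{when } \sigma, \varepsilon, \mathbf{alloc}(n) \xrightarrow{s} \sigma', \ell \\
\text{P.2} & \langle\Pi, R^{\nu}_{\ell[n]} \cdot T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} \xrightarrow{t} \langle\Pi \cdot R^{\nu}_{\ell[n]}, T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} & \text{when } \sigma, \varepsilon, \mathbf{read}(\ell[n]) \xrightarrow{s} \sigma, \nu \\
\text{P.3} & \langle\Pi, W^{\nu}_{\ell[n]} \cdot T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} \xrightarrow{t} \langle\Pi \cdot W^{\nu}_{\ell[n]}, T\rangle, \sigma', \kappa, \varepsilon, \mathbf{prop} & \text{when } \sigma, \varepsilon, \mathbf{write}(\ell[n], \nu) \xrightarrow{s} \sigma', 0 \\
\text{P.4} & \langle\Pi, M_{\rho,e} \cdot T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} \xrightarrow{t} \langle\Pi \cdot M_{\rho,e}, T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} & \\
\text{P.5} & \langle\Pi, U_{\rho,e} \cdot T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} \xrightarrow{t} \langle\Pi \cdot U_{\rho,e}, T\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} & \\
\text{P.6} & \langle\Pi, (T_1) \cdot T_2\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} \xrightarrow{t} \langle\Pi \cdot \boxplus_{T_2}, T_1\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} & \\
\text{P.7} & \langle\Pi, \bar{\nu}\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} \xrightarrow{t} \langle\Pi \cdot \bar{\nu}, \varepsilon\rangle, \sigma, \kappa, \varepsilon, \bar{\nu} & \\
\text{P.8} & \langle\Pi, \varepsilon\rangle, \sigma, \kappa, \varepsilon, \bar{\nu} \xrightarrow{t} \langle\Pi' \cdot (T_1), T_2\rangle, \sigma, \kappa, \varepsilon, \mathbf{prop} & \text{when } \langle\Pi, \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\Pi' \cdot \boxplus_{T_2}, \varepsilon\rangle; T_1 \\[1ex]
\multicolumn{3}{l}{\textbf{Undoing}} \\
\text{U.1} & \langle\Pi, A_{\ell,n} \cdot T\rangle, \sigma, \kappa, \rho, \alpha_r \xrightarrow{t} \langle\Pi, T\rangle, \sigma[\ell \mapsto \diamond], \kappa, \rho, \alpha_r & \\
\text{U.2} & \langle\Pi, t \cdot T\rangle, \sigma, \kappa, \rho, \alpha_r \xrightarrow{t} \langle\Pi, T\rangle, \sigma, \kappa, \rho, \alpha_r & \text{when } t \text{ is neither an allocation action nor a push action} \\
\text{U.3} & \langle\Pi, (T_1) \cdot T_2\rangle, \sigma, \kappa, \rho, \alpha_r \xrightarrow{t} \langle\Pi \cdot \boxminus_{T_2}, T_1\rangle, \sigma, \kappa, \rho, \alpha_r & \\
\text{U.4} & \langle\Pi \cdot \boxminus_{T}, \varepsilon\rangle, \sigma, \kappa, \rho, \alpha_r \xrightarrow{t} \langle\Pi, T\rangle, \sigma, \kappa, \rho, \alpha_r &
\end{array}
\]

Figure 16. Stepping relation for tracing machine ($\xrightarrow{t}$).

3.6 Tracing Machine Transitions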



We use the components and transitions of the reference
machine (Sections 3.2 and 3.3, respectively) as a basis for
defining the transitions of the tracing machine. Recall
from Section 3.2 that the tracing machine extends the
reference machine in two important ways.

First, the machine configurations of the tracing machine
extend the reference configurations with an extra
component $\langle\Pi, T\rangle$, the trace zipper (Section 3.5), which augments
the trace structure $T$ (Section 3.4) with a trace context and a
movable focus.

Second, a tracing command $\alpha_t$ consists of either a
reference command $\alpha_r$ or the additional propagation
command prop, which indicates that the machine is doing
change propagation. Using these two extensions of the
reference machine, the tracing machine generates traces of
execution (during evaluation transitions), discards parts of
previously-generated traces (during undoing transitions),
and reuses previously-generated traces (during propagation
transitions).

These three transition modes (evaluation, undoing and
propagation) can interact in ways that are not
straightforward. Figure 15 helps illustrate their interrelationships,
giving us a guide for the transition rules of the tracing machine.
The arcs indicate the machine command before and after
the machine applies the indicated transition rule (written in
blue). Figure 16 gives the complete transition relation for the
tracing machine. Recall that each transition is of the form:

\[
\langle\Pi, T\rangle, \sigma, \kappa, \rho, \alpha_t \xrightarrow{t} \langle\Pi', T'\rangle, \sigma', \kappa', \rho', \alpha_t'
\]

We explain Figure 16 using Figure 15 as a guide. Under
an expression command $e$, the machine can take both
evaluation (E.0-6) and undo (U.1-4) transitions while
remaining in evaluation mode, as well as transitions E.P and E.7,
which each change to another command. Under the
propagation command prop, the machine can take propagation
transitions (P.1-6) while remaining in propagation mode, as
well as transitions P.E and P.7, which each change to another
command.

Propagation can transition into evaluation (P.E) when it is
focused on an update action that it (non-deterministically)
chooses to activate; it may also (non-deterministically)
choose to ignore this opportunity and continue
propagation. Dually, evaluation can transition directly into
propagation (E.P) when its command is a memo point that matches
a memo point currently focused in the reuse trace (and
in particular, the environment $\rho$ must also match); it may
also (non-deterministically) choose to ignore this
opportunity and continue evaluation. We describe deterministic
algorithms for change propagation and memoization in
Section 7.

Evaluation (respectively, propagation) transitions into a
value sequence $\bar{\nu}$ after evaluating (respectively,
propagating) a pop operation under E.7 (respectively, P.7). Under the
value sequence command, the machine can continue to undo
the reuse trace (U.1-4). To change commands, it rewinds its
trace context and either resumes evaluation (E.8) upon
finding the mark $\Box$, or resumes propagation (P.8) upon finding
the mark $\boxplus$. The machine rewinds the trace using the
following trace rewinding relation:

\[
\begin{array}{rcl}
\langle\Pi \cdot t, T\rangle; T' & \circlearrowleft & \langle\Pi, T\rangle; t \cdot T' \\
\langle\Pi \cdot \boxminus_{T_2}, \varepsilon\rangle; T' & \circlearrowleft & \langle\Pi, T_2\rangle; T' \\
\langle\Pi \cdot \boxminus_{T_2}, t \cdot T_1\rangle; T' & \circlearrowleft & \langle\Pi, (t \cdot T_1) \cdot T_2\rangle; T'
\end{array}
\]

This relation simultaneously performs two functions. First,
it moves the focus backwards across actions (towards the
start of the trace) while moving these actions into a new
subtrace $T'$; the first case captures this behavior. Second,
when moving past a leftover undo mark $\boxminus_{T_2}$, it moves the
subtrace $T_2$ back into the reuse trace; the second and third
cases capture this behavior. Note that unlike $\boxminus$, there is
no way to rewind beyond either $\Box$ or $\boxplus$ marks. This is
intentional: rewinding is meant to stop when it encounters
either of these marks.
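
The three cases of the rewinding relation can be sketched as a loop that runs until it reaches an evaluation or propagation mark (or the start of the context). This is a minimal sketch under our own assumptions: the class names `EvalMark`, `UndoMark`, `PropMark` and `Sub` are hypothetical stand-ins for the three marks and for push actions, and traces are modeled as Python tuples.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EvalMark:          # evaluation mark: start of a subtrace being generated
    pass

@dataclass(frozen=True)
class UndoMark:          # undo mark, carrying the saved remainder T2
    saved: tuple

@dataclass(frozen=True)
class PropMark:          # propagation mark, carrying the saved remainder T2
    saved: tuple

@dataclass(frozen=True)
class Sub:               # push action (T): a nested subtrace
    trace: tuple

def rewind(ctx, reuse):
    """Apply the rewinding relation repeatedly, starting with an empty
    accumulated subtrace. Returns (context, reuse trace, gathered subtrace).
    """
    ctx, reuse, gathered = list(ctx), tuple(reuse), []
    while ctx:
        top = ctx[-1]
        if isinstance(top, (EvalMark, PropMark)):
            break                                 # rewinding stops at these marks
        ctx.pop()
        if isinstance(top, UndoMark):
            if reuse:                             # third case: rewrap the leftover
                reuse = (Sub(reuse),) + top.saved
            else:                                 # second case: restore T2
                reuse = top.saved
        else:                                     # first case: gather the action
            gathered.insert(0, top)
    return tuple(ctx), reuse, tuple(gathered)
```

Rule E.8 would then replace the evaluation mark at the end of the returned context with a push action built from the gathered subtrace; that final step is left out of the sketch.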

4. Consistency 

In this section we formalize a notion of consistency between
the reference machine and tracing machine. As a first step,
we show that when run from scratch (without a reuse trace),
the results of the tracing machine are consistent with the
reference machine, i.e., the final machine values and stores
coincide. To extend this property beyond from-scratch runs, it
is necessary to make an additional assumption: we require
each IL program run in the tracing machine to be
compositionally store agnostic (CSA, see below). We then show that,
for CSA IL programs, the tracing machine reuses
computations in a consistent way: its final trace, store, and machine
values are consistent with a from-scratch run of the tracing
machine, and hence, they are consistent with a run of the
reference machine.

Finally, we discuss some interesting invariants of the
tracing machine (Section 4.3) that play a crucial role in the
consistency proof.

4.1 Compositional Store Agnosticism (CSA) 

The property of compositional store agnosticism
characterizes the programs for which our tracing machine runs
consistently. We build this property from a less general
property that we call store agnosticism. Intuitively, an IL program
is store agnostic iff, whenever an update instruction is
performed during its execution, then the value sequence that
will eventually be popped is already determined at this point
and, moreover, independent of the current store.

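Definition 4.1. Formally, we define $SA(\sigma, \rho, e)$ to mean:
If $\sigma, \varepsilon, \rho, e \xrightarrow{r}{}^{*} \_, \_, \rho', \mathbf{update}\ e'$, then there exists $\bar{\nu}$ such
that $\bar{w} = \bar{\nu}$ whenever $\_, \varepsilon, \rho', e' \xrightarrow{r}{}^{*} \_, \varepsilon, \varepsilon, \bar{w}$.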

To see why this property is significant, recall how the 
tracing machine deals with intermediate results. In stepping 
rule E.8, the tracing machine mirrors the reference machine: 
it passes the results to the function on the top of the con- 
trol stack. However, in stepping rule P.8, the tracing machine 
does not mirror the reference machine: it essentially discards 
the intermediate results and continues to process the remain- 
ing reuse trace. This behavior is not generally consistent with 
the reference machine: If P.8 is executed after switching to 
evaluation mode (P.E) and performing some computation in 
order to adjust to a modified store, then the corresponding 
intermediate result may be different. However, if the subpro- 
gram that generated the reuse trace was store agnostic, then 
this new result will be the same as the original one; conse- 
quently, it is then safe to continue processing the remaining 
reuse trace. 

Compositional store agnosticism is a generalization of 
store agnosticism that is preserved by execution. 

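Definition 4.2. We define $CSA(\sigma, \rho, e)$ to mean:
If $\sigma, \varepsilon, \rho, e \xrightarrow{r}{}^{*} \sigma', \kappa', \rho', e'$, then $SA(\sigma', \rho', e')$.

Lemma 4.1. If $\sigma, \varepsilon, \rho, e \xrightarrow{r}{}^{*} \sigma', \kappa', \rho', e'$ and $CSA(\sigma, \rho, e)$,
then $CSA(\sigma', \rho', e')$.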

4.2 Consistency of the Tracing Machine

The first correctness property says that, when run from 
scratch (i.e. without a reuse trace), the tracing machine mir- 
rors the reference machine. 

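Theorem 4.2 (Consistency of from-scratch runs).
If $\langle\varepsilon, \varepsilon\rangle, \sigma, \varepsilon, \rho, \alpha_r \xrightarrow{t}{}^{*} \langle\_, \_\rangle, \sigma', \varepsilon, \varepsilon, \bar{\nu}$
then $\sigma, \varepsilon, \rho, \alpha_r \xrightarrow{r}{}^{*} \sigma', \varepsilon, \varepsilon, \bar{\nu}$.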

In the general case, the tracing machine does not run from
scratch, but with a reuse trace generated by a from-scratch
run. To aid readability for such executions we introduce
some notation. We call a machine reduction balanced if
the initial and final stacks are each empty, and the initial
and final trace contexts are related by the trace rewinding
relation. If we know that the stack and trace context components
of a machine reduction meet these criteria, we can specify this
(balanced) reduction more concisely.

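Definition 4.3 (Balanced reductions).

\[
\dfrac{\langle\varepsilon, T\rangle, \sigma, \varepsilon, \rho, \alpha_r \xrightarrow{t}{}^{*} \langle\Pi, \varepsilon\rangle, \sigma', \varepsilon, \varepsilon, \bar{\nu} \qquad \langle\Pi, \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\varepsilon, \varepsilon\rangle; T'}{T, \sigma, \rho, \alpha_r \Downarrow T', \sigma', \bar{\nu}}
\]

\[
\dfrac{\langle\varepsilon, T\rangle, \sigma, \varepsilon, \varepsilon, \mathbf{prop} \xrightarrow{t}{}^{*} \langle\Pi, \varepsilon\rangle, \sigma', \varepsilon, \varepsilon, \bar{\nu} \qquad \langle\Pi, \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\varepsilon, \varepsilon\rangle; T'}{T, \sigma \curvearrowright T', \sigma', \bar{\nu}}
\]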

We now state our second correctness result. It uses an
auxiliary function that collects garbage: $\sigma|_{gc}(\ell) = \sigma(\ell)$ for
$\ell \in \mathrm{dom}(\sigma|_{gc}) = \{\ell \mid \ell \in \mathrm{dom}(\sigma) \text{ and } \sigma(\ell) \neq \diamond\}$.
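
Operationally, the garbage-collecting restriction simply drops every location that has been marked as garbage. The sketch below illustrates this under a simplified store model of our own choosing: the store is a dictionary from locations to either a dictionary of entries or a `GARBAGE` sentinel, and `undo_alloc` is a hypothetical helper that mimics the store update of rule U.1.

```python
GARBAGE = object()   # sentinel standing for the garbage marking of a location

def undo_alloc(store, loc):
    """Mimic rule U.1: mark `loc` as garbage, so its entries become undefined."""
    return {**store, loc: GARBAGE}

def gc(store):
    """Restrict the store to locations that are not marked as garbage."""
    return {loc: cells for loc, cells in store.items() if cells is not GARBAGE}
```

A store without garbage marks is left unchanged by `gc`, which matches the definition above.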

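Theorem 4.3 (Consistency).

Suppose $\varepsilon, \sigma_1, \rho_1, \alpha_{r1} \Downarrow T_1, \sigma_1', \bar{\nu}_1$ and $CSA(\sigma_1, \rho_1, \alpha_{r1})$.

1. If $T_1, \sigma_2, \rho_2, \alpha_{r2} \Downarrow T_1', \sigma_2', \bar{\nu}_2$
   then $\varepsilon, \sigma_2|_{gc}, \rho_2, \alpha_{r2} \Downarrow T_1', \sigma_2'|_{gc}, \bar{\nu}_2$.

2. If $T_1, \sigma_2 \curvearrowright T_1', \sigma_2', \bar{\nu}_2$
   then $\varepsilon, \sigma_2|_{gc}, \rho_1, \alpha_{r1} \Downarrow T_1', \sigma_2'|_{gc}, \bar{\nu}_2$.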

The first statement says that, when run with an arbitrary
from-scratch generated trace $T_1$, the tracing machine
produces a final trace, store and return value sequence that are
consistent with a from-scratch run of the same program.
The second statement is analogous, except that it concerns
change propagation: when run over an arbitrary from-scratch
generated trace $T_1$, the machine produces a result consistent
with a from-scratch run of the program that generated $T_1$.
Note that in each case the initial store may be totally
different from the one used to generate $T_1$.

Finally, observe how each part of Theorem 4.3 can be
composed with Theorem 4.2 to obtain a corresponding run
of the reference machine.

Collecting the garbage. The tracing machine may undo
portions of the reuse trace in order to adjust it to a new store.
Whenever it undoes an allocation (rule U.1), it marks the
corresponding location as garbage ($\ell \mapsto \diamond$).

For this to make sense, we must be sure that these
locations are not live in the final result, i.e., they appear
neither in $T_1'$ nor in $\bar{\nu}_2$, nor are they referenced from the live
portion of $\sigma_2'$. In fact, this is a consequence of the consistency
theorem: the from-scratch run in the conclusion produces the same $T_1'$
and $\bar{\nu}_2$. Moreover, since its final store is $\sigma_2'|_{gc}$, it is clear that
these components and $\sigma_2'|_{gc}$ itself cannot refer to garbage.

4.3 Invariants 

The proof of Theorem 4.3 is by induction on the length of
the given from-scratch run producing $T_1$. It requires
numerous lemmas and, moreover, the theorem statement needs to
be strengthened in several ways. In the remainder of this
section, we explain the main generalizations as they expose
invariants of the tracing machine that are crucial for its correct
functioning. Full details of this and all other proofs
mentioned later on can be found in the accompanying technical
appendix.

Non-empty trace context and stack. Neither the trace context
nor the stack will stay empty during execution, so we
need to account for that. In part 2 of the generalized version
of the theorem we therefore assume the following about the
given from-scratch run (see below for part 1):

a) $\mathrm{CSA}(\sigma_1, \rho_1, \alpha_{r1})$

b) $\langle \Pi_1, \varepsilon \rangle, \sigma_1, \kappa_1, \rho_1, \alpha_{r1} \to_t^{*} \langle \Pi_1', \varepsilon \rangle, \sigma_1', \kappa_1, \varepsilon, \nu_1$

c) $\langle \Pi_1', \varepsilon \rangle; \varepsilon \circlearrowleft^{*} \langle \Pi_1, \varepsilon \rangle; T_1$

d) $\Pi_1$ contains neither undo ($\boxminus$) nor propagation ($\boxplus$) marks

When these conditions are all met, we say $T_1$ fsc
("from-scratch consistent"). Condition (a) is the same as in the
theorem statement. Conditions (b) and (c) are similar to the
assumptions stated in the theorem, except more general: they
allow a non-empty trace context and a non-empty stack. The
new condition (d) ensures that the trace context mentioned
in (b) and (c) only describes past evaluation steps, and not
past or pending undoing or propagating steps. Apart from
the assumption, we also must generalize the rest of part 2
accordingly, but we omit the details here.

Reuse trace invariants. While it is intuitively clear that
propagation (part 2) must run with a from-scratch generated
trace in order to generate one, this is not strictly necessary
for evaluation (part 1). In fact, here the property $T_1$ fsc is
not always preserved: recall that in evaluation mode the
machine may undo steps in $T_1$. Doing so may lead to a reuse
trace that is no longer from-scratch generated! In particular,
if $T_1 = (t \cdot T_2) \cdot T_3$, then, using steps U.3, U.2 and
eventually E.8, the machine may essentially transform this into
$(T_2) \cdot T_3$, which in general may not have a corresponding
from-scratch run.

In order for the induction to go through, we therefore
introduce a weaker property, $T_1$ ok, for part 1. It is defined
as follows:

\[
\frac{}{\varepsilon\ \mathsf{ok}}
\qquad
\frac{T\ \mathsf{fsc}}{T\ \mathsf{ok}}
\qquad
\frac{T\ \mathsf{ok} \quad T'\ \mathsf{ok}}{(T) \cdot T'\ \mathsf{ok}}
\]

Note that if $T_1 = M_{\rho,e} \cdot T_2$ and $T_1$ ok, then $T_1$ fsc (and
thus $T_2$ fsc) follows by inversion. This comes up in the proof
precisely when in part 1 evaluation switches to propagation
(step E.P) and we therefore want to apply the inductive
hypothesis of part 2, where we need to know that the new
reuse trace is fsc (not "just" ok).



^ To our knowledge, this is the first work that characterizes the entire trace
(both in and out of focus), in the midst of adjustment. Such characterizations 
may be useful, for example, to verify efficient implementations of the 
tracing machine. 
⟦let fun f(x̄).e1 in e2⟧_y   =  let fun f(x̄@z).⟦e1⟧_z in ⟦e2⟧_y
⟦let x = ⊕(ȳ) in e⟧_y       =  let x = ⊕(ȳ) in ⟦e⟧_y
⟦if x then e1 else e2⟧_y    =  if x then ⟦e1⟧_y else ⟦e2⟧_y
⟦f(x̄)⟧_y                    =  f(x̄@y)
⟦let x = ι in e⟧_y          =  let x = ι in ⟦e⟧_y
⟦memo e⟧_y                  =  memo ⟦e⟧_y
⟦update e⟧_y                =  update ⟦e⟧_y
⟦push f do e⟧_y             =  let fun f'(z). update
  when Arity(f) = n                 let x1 = read(z[1]) in ···
                                    let xn = read(z[n]) in
                                    f(x1, ..., xn, y)
                                in
                                push f' do memo
                                  let z = alloc(n) in ⟦e⟧_z
⟦pop x̄⟧_y                   =  let _ = write(y[1], x1) in ···
  when |x̄| = n                  let _ = write(y[n], xn) in
                                pop ⟨y⟩

⟦ε⟧                         =  ε
⟦ρ[x ↦ ν]⟧                  =  ⟦ρ⟧[x ↦ ν]
⟦ρ[f ↦ fun f(x̄).e]⟧         =  ⟦ρ⟧[f ↦ fun f(x̄@y).⟦e⟧_y]

Figure 17. Destination-passing-style (DPS) conversion.

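Trace context invariant. In order for $T_1$ ok and $T_1$ fsc to be
preserved by steps U.4 and P.8, respectively, we also require
$\Pi_1$ ok, defined as follows:

\[
\frac{}{\varepsilon\ \mathsf{ok}}
\qquad
\frac{\Pi\ \mathsf{ok}}{\Pi \cdot t\ \mathsf{ok}}
\qquad
\frac{\Pi\ \mathsf{ok}}{\Pi \cdot \square\ \mathsf{ok}}
\qquad
\frac{\Pi\ \mathsf{ok} \quad T\ \mathsf{fsc}}{\Pi \cdot \boxplus T\ \mathsf{ok}}
\qquad
\frac{\Pi\ \mathsf{ok} \quad T\ \mathsf{ok}}{\Pi \cdot \boxminus T\ \mathsf{ok}}
\]

Note the different assumptions about $T$ in the last two rules.
This corresponds exactly to the different assumptions about
$T_1$ in part 1 and part 2.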

5. Destination-Passing Style 



In Section 4.1 we defined the CSA property that the tracing
machine requires of all programs for consistency. In this
section, we describe a destination-passing-style (DPS) transformation
and show that it transforms arbitrary IL programs into CSA
IL programs, while preserving their semantics. The idea is
as follows: a DPS-converted program takes an additional
parameter $x$ that acts as its destination. Rather than return its
results directly, the program instead writes them to the
memory specified by $x$.

Figure 17 defines the DPS transformation for an expression
$e$ and a destination variable $x$, written $[\![e]\!]_x$. Naturally,
to DPS-convert an expression closed by an environment $\rho$,
we must DPS-convert the environment as well, written $[\![\rho]\!]$.
In order to comply with our assumption that all function and
variable names are distinct, the conversion actually has to
thread through a set of already-used names. For the sake of
readability we do not include this here.

Most cases of the conversion are straightforward. The
interesting ones include function definition, function
application, push, and pop. For function definitions, the conversion
extends the function arguments with an additional destination
parameter $z$ (we write $\overline{x}@z$ to mean $\overline{x}$ appended with
$z$). Correspondingly, for application of a function $f$, the
conversion additionally passes the current destination to $f$. For
pushes, we allocate a fresh destination $z$ for the push body;
we memoize this allocation with a memo point. When the
push body terminates, instead of directly passing control to
$f$, the program calls a wrapper function $f'$ that reads the
destination and finally passes the values to the actual function
$f$. Since these reads may become inconsistent in subsequent
runs, we prepend them with an update point. For pops,
instead of directly returning its result, the converted program
writes it to its destination and then returns the latter.

As desired, the transformation yields CSA programs
(here and later on we assume that $n$ is the arity of the
program being transformed):

Theorem 5.1 (DPS programs are CSA).
$\mathrm{CSA}(\sigma, [\![\rho]\!], \mathtt{let}\ x = \mathtt{alloc}(n)\ \mathtt{in}\ [\![e]\!]_x)$

Moreover, the transformation preserves the extensional
semantics of the original program:

Theorem 5.2 (DPS preserves extensional semantics).

If $\sigma_1, \varepsilon, \rho, e \to_r^{*} \sigma_1', \varepsilon, \varepsilon, \nu$

then $\sigma_1, \varepsilon, [\![\rho]\!], \mathtt{let}\ x = \mathtt{alloc}(n)\ \mathtt{in}\ [\![e]\!]_x \to_r^{*} \sigma_1' \uplus \sigma_2', \varepsilon, \varepsilon, \ell$
with $\sigma_2'(\ell, i) = \nu_i$ for all $i$.

Because it introduces destinations, the transformed program
allocates additional store locations $\sigma_2'$. These locations
are disjoint from the original store $\sigma_1'$, whose contents are
preserved in the transformed program. If we follow one step
of indirection, from the returned location to the values it
contains, we recover the original results $\nu$.

5.1 An Example 

As a simple illustrative example, consider the source-level
expression f(max(*p, *q)), which applies function f to the
maximum of two dereferenced pointers *p and *q. Our front
end translates this expression into the following:

    push f do
      update
        let x = read(p[0]) in
        let y = read(q[0]) in
        if x > y then pop x else pop y

Notice that the body of this push is not store agnostic:
when the memory contents of either pointer are changed, the
update body can evaluate to a different return value, namely
the new maximum of x and y. To address this, the DPS
transformation converts this fragment into the following:

    let fun f'(m). update
        let m' = read(m[0]) in f(m', z)
    in
    push f' do memo
      let m = alloc(1) in
      update
        let x = read(p[0]) in
        let y = read(q[0]) in
        if x > y then
          let _ = write(m[0], x) in pop m
        else
          let _ = write(m[0], y) in pop m

Notice that instead of returning the value of either x or 
y as before, the body of the push now returns the value of 
m, a pointer to the maximum of x and y. In this case, the 
push body is indeed store agnostic — though x and y may 
change, the pointer value of m remains fixed, since it is 
defined outside of the update body. 

The astute reader may wonder why we place the allocation
of m within the bodies of the push and memo point,
rather than "lift it" outside the definition of function f'.
After all, by lifting it, we would not need to return m to f'
via the stack pop, since the scope of variable m would include
that of function f'. We place the allocation of m where we
do to promote reuse of nondeterminism: by inserting this
memo point, the DPS transformation effectively associates
local input state (the values of p and q) with the local
output state (the value of m). Without this memo point, every
push body will generate a fresh destination each time it is
reevaluated, and in general, this nondeterministic choice will
prevent reuse of any subcomputation, since this
subcomputation's local state includes a distinct, previously chosen
destination. To avoid this behavior and to allow these
subcomputations to instead be reused during change propagation,
the DPS conversion inserts memo points that enclose each
(non-deterministic) allocation of a destination.
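
To make the contrast between direct style and destination-passing style concrete, here is a small sketch in Python. It is only an analogy under stated assumptions: the store is a dictionary of one-element lists, and the names `max_direct`, `max_dps` and `f_wrapper` are hypothetical helpers, not part of IL or of the compiler described here.

```python
# The store maps location names to one-word cells (assumption for this sketch).
store = {"p": [3], "q": [5]}

def max_direct(p, q):
    # Direct style: the returned value depends on the store contents.
    x, y = store[p][0], store[q][0]
    return x if x > y else y

def max_dps(p, q, m):
    # DPS: write the result into destination m and return m itself.
    x, y = store[p][0], store[q][0]
    store[m][0] = x if x > y else y
    return m

def f_wrapper(m, f):
    # Plays the role of f': read the destination, then call the real f.
    return f(store[m][0])

store["m"] = [None]                    # the destination, allocated once
r1 = max_dps("p", "q", "m")
v1 = f_wrapper(r1, lambda v: v * 10)

store["p"][0] = 9                      # the input changes
r2 = max_dps("p", "q", "m")
v2 = f_wrapper(r2, lambda v: v * 10)

print(r1 == r2, v1, v2)                # True 50 90
```

The value handed back by the DPS body is the same location in both runs, even though the maximum changed; only the read inside the wrapper observes the change.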

6. Cost Models 

We define a generic framework for modeling various dynamic
costs of our IL abstract machines (both reference
and tracing). By instantiating the framework with different
concrete cost models, we show several cost equivalences
between the IL reference machine and the IL tracing machine
(Section 3), show that our DPS conversion (Section 5)
respects the intensional reference semantics of IL up to
certain constant factors, and give a cost model for our
implementation (Sections 7 and 8).

Cost model framework. We define machine steps and step
sequences generically for both the reference and tracing
machines. Let $S$ be a (finite) set of steps, where each step $s \in S$
corresponds to precisely one stepping rule available to the
machine in question. For the reference machine, these steps
consist of R.1-11 (Figure 11), though sometimes we distinguish
between the sub-cases of R.6 (Figure 12). For the tracing
machine, the steps consist of E.0-8, P.E, E.P, P.1-8, U.1-4
(Figure 16). Given an initial machine state, we define a step
sequence $\overline{s}$ as the zero or more steps $s_i \in S$ taken by some
execution of the machine until it terminates (with an empty
stack). No step sequence is defined when the machine fails
to terminate with an empty stack (i.e., when it either diverges
or becomes stuck). Note that when the machine permits
non-deterministic steps, the initial machine state does not fix a
unique step sequence.

A cost model is a triple $M = \langle C, 0, \gamma \rangle$ where: $C$
is the type of costs; the zero cost $0 \in C$ is the cost of an
empty step sequence; and the cost function $\gamma : S \to C \to C$
assigns to each step $s \in S$ a function that maps the cost
before the step $s$ is taken to the cost after $s$ is taken. Given
an execution sequence $\overline{s} = \langle s_1, \ldots, s_n \rangle$, we define the cost
function of $\overline{s}$ under $M$ as the following composition of cost
functions: $\gamma\, \overline{s} = (\gamma\, s_n) \circ \cdots \circ (\gamma\, s_1)$. By assuming zero initial
cost, we can evaluate this composition of cost functions to
yield a total cost for $\overline{s}$ as $\gamma\, \overline{s}\, 0 = c \in C$.

Models for steps, stacks and stores. We define several
basic cost models for measuring machine steps, store usage
and stack usage. Cost model $M_s = \langle C_s, 0_s, \gamma_s \rangle$ counts
machine steps: $C_s = \mathbb{N}$, $0_s = 0$ and $\gamma_s\, s\, n = n + 1$.
Cost model $M_\sigma = \langle C_\sigma, 0_\sigma, \gamma_\sigma \rangle$ measures store usage as
the number of allocations (a), reads (r), and writes (w),
respectively. We represent these in a triple: $C_\sigma = \mathbb{N}^3$,
$0_\sigma = \langle 0, 0, 0 \rangle$, and $\gamma_\sigma$ is:

\[
\begin{array}{lcl}
\gamma_\sigma\, s_{alloc}\, \langle a, r, w \rangle & = & \langle a + 1, r, w \rangle \\
\gamma_\sigma\, s_{read}\, \langle a, r, w \rangle & = & \langle a, r + 1, w \rangle \\
\gamma_\sigma\, s_{write}\, \langle a, r, w \rangle & = & \langle a, r, w + 1 \rangle \\
\gamma_\sigma\, s_{nostore}\, \langle a, r, w \rangle & = & \langle a, r, w \rangle
\end{array}
\]

To instantiate the model for the reference machine we set
$s_{alloc}$ = R.6/S.1, $s_{read}$ = R.6/S.2 and $s_{write}$ = R.6/S.3;
similarly, for the tracing machine we set $s_{alloc}$ = E.1, $s_{read}$ =
E.2 and $s_{write}$ = E.3. For both machines, we instantiate
the case of $\gamma_\sigma\, s_{nostore}$ for each of the remaining steps. Cost
model $M_\kappa = \langle C_\kappa, 0_\kappa, \gamma_\kappa \rangle$ measures the stack usage as the
number of times the stack is pushed (u), the number of
times it is popped (d), the current stack height (h), and the
maximum stack height (m). We represent these as a 4-tuple
so that $C_\kappa = \mathbb{N}^4$ and $0_\kappa = \langle 0, 0, 0, 0 \rangle$. We define $\gamma_\kappa$ as:

\[
\begin{array}{lcl}
\gamma_\kappa\, s_{push}\, \langle u, d, h, m \rangle & = & \langle u + 1, d, h + 1, \max(m, h + 1) \rangle \\
\gamma_\kappa\, s_{pop}\, \langle u, d, h, m \rangle & = & \langle u, d + 1, h - 1, m \rangle \\
\gamma_\kappa\, s_{nostack}\, \langle u, d, h, m \rangle & = & \langle u, d, h, m \rangle
\end{array}
\]

To instantiate the model for the reference machine we set
$s_{push}$ = R.9 and $s_{pop}$ = R.11; similarly, for the tracing
machine we set $s_{push}$ = E.6 and $s_{pop}$ = E.8. Note that the
stack is actually popped by R.11 rather than R.10 (resp.
E.8 versus E.7). The latter steps, which each evaluate a
pop expression to a sequence of machine values, always
precede the actual stack pop by one step.
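
The framework above amounts to folding per-step cost functions over a step sequence. The following sketch shows this for the step, store and stack models. It assumes steps are represented by plain strings such as "push" or "read" (a simplification chosen for this example, not the machines' actual rule names), and the helper `total_cost` is a hypothetical name.

```python
from functools import reduce

def total_cost(gamma, zero, steps):
    """Apply the per-step cost functions in execution order, starting from zero."""
    return reduce(lambda cost, step: gamma(step, cost), steps, zero)

def gamma_steps(step, n):
    # Step model: every step costs one.
    return n + 1

def gamma_store(step, cost):
    # Store model: count allocations, reads and writes.
    a, r, w = cost
    if step == "alloc":
        return (a + 1, r, w)
    if step == "read":
        return (a, r + 1, w)
    if step == "write":
        return (a, r, w + 1)
    return cost  # any step that does not touch the store

def gamma_stack(step, cost):
    # Stack model: pushes, pops, current height, maximum height.
    u, d, h, m = cost
    if step == "push":
        return (u + 1, d, h + 1, max(m, h + 1))
    if step == "pop":
        return (u, d + 1, h - 1, m)
    return cost  # any step that does not touch the stack

steps = ["push", "alloc", "read", "push", "write", "pop", "pop", "other"]
step_count = total_cost(gamma_steps, 0, steps)
store_use = total_cost(gamma_store, (0, 0, 0), steps)
stack_use = total_cost(gamma_stack, (0, 0, 0, 0), steps)
print(step_count, store_use, stack_use)  # 8 (1, 1, 1) (2, 2, 0, 2)
```

Because each model is just a zero cost plus a step function, new models can be added without changing how step sequences are produced.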

In from-scratch runs, the costs of the tracing machine are
equivalent to those of the reference machine.

Theorem 6.1. Fix an initial machine state $\sigma, \varepsilon, \rho, e$. Run
under the reference machine to yield step sequence $\overline{s}$. Run
under the tracing machine with an empty reuse trace to yield
step sequence $\overline{s}'$. The following hold for $\overline{s}$ and $\overline{s}'$: (1) the
step counts under $M_s$ are equal; (2) the stack usage under
$M_\kappa$ is equal; and (3) the store usage under $M_\sigma$ is equal.

DPS costs. Recall that before IL programs can adjust in a
consistent way (in the tracing machine), we have to ensure
that they are compositionally store agnostic, e.g., by
DPS-converting them (Section 5). Below we bound the overhead
introduced by this transformation in terms of the reference
machine. By appealing to Theorem 6.1, this bound
equivalently applies to the tracing machine as well.

Theorem 6.2 (DPS preserves intensional semantics).

Consider the evaluations of expression $e$ and $[\![e]\!]_x$ as given
in Theorem 5.2. The following hold for their respective step
sequences, $\overline{s}$ and $\overline{s}'$: (1) the stack usage under $M_\kappa$ is equal.
Let $u$ be the number of pushes performed in each; (2) the
number of allocations under $M_\sigma$ differs by exactly $u$
(ignoring the initial allocation), the number of reads and writes
under $M_\sigma$ each differs by at most $a \cdot u$ and $a \cdot (u + 1)$,
respectively, where $a$ is the maximum arity of any pop taken in
$\overline{s}$; (3) the number of steps taken under $M_s$ differs by at most
$(2 \cdot a + 5) \cdot u + a$.
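
As a quick sanity check of how these bounds scale, the following sketch simply evaluates the expressions from Theorem 6.2 for given values of u and a. The function name `dps_overhead_bounds` and the dictionary layout are assumptions made for this illustration.

```python
def dps_overhead_bounds(u, a):
    """Overheads stated in Theorem 6.2 for u pushes and maximum pop arity a."""
    return {
        "allocs": u,                    # exact difference in allocations
        "reads": a * u,                 # upper bound on extra reads
        "writes": a * (u + 1),          # upper bound on extra writes
        "steps": (2 * a + 5) * u + a,   # upper bound on extra steps
    }

bounds = dps_overhead_bounds(u=3, a=2)
print(bounds)  # {'allocs': 3, 'reads': 6, 'writes': 8, 'steps': 29}
```

For fixed arity a, every bound grows linearly in the number of pushes u, which is the sense in which the conversion costs only a constant factor.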

Realized costs. Realized costs closely resemble those
of a real implementation. We model them with $M_t =
\langle C_t, 0_t, \gamma_t \rangle$, which partitions step counts of the tracing
machine into evaluation (e), undo (u) and propagation (p) step
counts. As in previous work, our implementation does not
incur any cost for any propagation steps taken; these steps
are effectively skipped. Therefore, we define the realized
cost of $\langle e, p, u \rangle \in C_t$ as $(e + u) \in \mathbb{N}$. These realized costs
are proportional to the actual work performed by IL
programs compiled by our implementation (Sections 7 and 8).*
Each cost is a triple:

\[
\begin{array}{lcl}
0_t & = & \langle 0, 0, 0 \rangle \\
\gamma_t\, s_{eval}\, \langle e, p, u \rangle & = & \langle e + 1, p, u \rangle \\
\gamma_t\, s_{prop}\, \langle e, p, u \rangle & = & \langle e, p + 1, u \rangle \\
\gamma_t\, s_{undo}\, \langle e, p, u \rangle & = & \langle e, p, u + 1 \rangle
\end{array}
\]

(Here $s_{eval}$ matches steps E.0-8, $s_{prop}$ matches steps P.1-8,
and $s_{undo}$ matches steps U.1-4.)
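
A minimal sketch of the realized cost computation follows. It assumes that steps are given by their rule names as strings and that the mode-switching steps E.P and P.E leave the triple unchanged; the latter is an assumption of this example, since the text only classifies E.0-8, P.1-8 and U.1-4.

```python
EVAL = {f"E.{i}" for i in range(0, 9)}   # E.0 .. E.8
PROP = {f"P.{i}" for i in range(1, 9)}   # P.1 .. P.8
UNDO = {f"U.{i}" for i in range(1, 5)}   # U.1 .. U.4

def realized_cost(steps):
    """Return the (e, p, u) triple and the realized cost e + u."""
    e = p = u = 0
    for s in steps:
        if s in EVAL:
            e += 1
        elif s in PROP:
            p += 1
        elif s in UNDO:
            u += 1
        # other steps (e.g. E.P, P.E) are left uncounted in this sketch
    return (e, p, u), e + u

triple, cost = realized_cost(["E.1", "E.P", "P.3", "P.4", "U.1", "E.8"])
print(triple, cost)  # (2, 2, 1) 3
```

Propagation steps are tallied but excluded from the realized cost, mirroring the statement that the implementation effectively skips them.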

7. Compiling IL 

While the semantics of IL are given by an abstract machine
(Section 3), in actuality we want to run IL programs
with a more conventional machine, e.g., a machine that
does not support tracing or change propagation directly. As
such, our compilation process can be thought of as building
a specialized tracing machine for a given IL program. At a
high level, realizing this machine requires realizing each of
its components, i.e., realizing its store, stack, environment,
trace and stepping rules.

* The implementation cost may involve an additional logarithmic factor,
e.g., to maintain a persistent view of the store for every point in the trace.

7.1 Runtime data structures and algorithms

The primary role of the runtime system is to provide realized 
versions of the abstract machine's trace and store, an effi- 
cient search for matching memo points, and an efficient al- 
gorithm for change propagation. To give an efficient change 
propagation algorithm, it is crucial that the runtime trace and 
store be "entangled", i.e., mutually referential: the runtime 
store references certain runtime trace actions, and the run- 
time representation of read and write trace actions each ref- 
erence the runtime store. 

The runtime trace. At a high level, the trace provides an
ordering to trace actions. For efficiency, we use a (total)
order maintenance data structure [17], which bestows on each
trace action $t$ an associated time stamp $s(t)$; these time
stamps admit an efficient predicate for checking if $t_1 < t_2$
by checking if $s(t_1) < s(t_2)$. Concretely, a trace node is a
record consisting of a time stamp $s$ and (at least one) trace
action $t$. As a refinement to this approach, below we also
consider when and how several trace actions can share a
single trace node. Most of the trace actions are straightforward
to represent during runtime, though extra care is needed for
read and write actions, which we describe below.
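
The idea of comparing trace actions through time stamps can be sketched as follows. This is a toy stand-in, not the order maintenance structure cited above: it assigns exact rational labels with `fractions.Fraction` and locates the predecessor with a linear search, so only the comparison itself is constant time. The class name `Order` and its methods are hypothetical.

```python
from fractions import Fraction

class Order:
    """Toy total order: each item carries a rational label (its time stamp)."""

    def __init__(self):
        self.labels = {}   # item -> Fraction label
        self.items = []    # items in their current order

    def insert_after(self, prev, item):
        """Insert item immediately after prev (or at the front if prev is None)."""
        idx = 0 if prev is None else self.items.index(prev) + 1
        lo = self.labels[self.items[idx - 1]] if idx > 0 else Fraction(0)
        hi = self.labels[self.items[idx]] if idx < len(self.items) else lo + 2
        self.labels[item] = (lo + hi) / 2   # a label strictly between neighbors
        self.items.insert(idx, item)

    def precedes(self, a, b):
        """Check a < b by comparing time stamps."""
        return self.labels[a] < self.labels[b]

order = Order()
order.insert_after(None, "t1")
order.insert_after("t1", "t3")
order.insert_after("t1", "t2")   # lands between t1 and t3
print(order.items)               # ['t1', 't2', 't3']
```

A production structure would bound label sizes and relabel locally; this sketch only shows why an insertion in the middle of the trace does not disturb the stamps of existing actions.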

The runtime store. While the store of the abstract machine
only retains the current value of each location-offset
entry $\ell[i]$ (hereafter, just an entry), this generally requires
traversing and replaying the entire trace during change
propagation, which is prohibitively expensive. As such, the
runtime store takes a different tack: for each entry, it
persistently maintains all the corresponding read and written
values, across the entire trace, including the corresponding trace
action. Given a particular point in the trace, we quickly
access the current value of any entry based on when it was last
read or written, in terms of the time stamps described above.
For this purpose, the runtime uses a self-balancing search
tree for each changeable store entry; each node of the tree
corresponds to a read or write trace action.
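
The per-entry history can be approximated with a sorted list in place of the self-balancing tree. The sketch below records only writes and answers "what value is visible at this time stamp"; the class `EntryHistory` and its integer stamps are assumptions for illustration, and a sorted list has linear-time insertion where a balanced tree would be logarithmic.

```python
import bisect

class EntryHistory:
    """Writes to one store entry, kept sorted by time stamp."""

    def __init__(self):
        self.stamps = []   # sorted time stamps of writes
        self.values = []   # values[i] was written at stamps[i]

    def write(self, stamp, value):
        i = bisect.bisect_left(self.stamps, stamp)
        self.stamps.insert(i, stamp)
        self.values.insert(i, value)

    def value_at(self, stamp):
        """Value visible at `stamp`: the latest write at or before it."""
        i = bisect.bisect_right(self.stamps, stamp)
        if i == 0:
            raise KeyError("entry not yet written at this point in the trace")
        return self.values[i - 1]

history = EntryHistory()
history.write(10, "a")
history.write(30, "c")
history.write(20, "b")            # inserted out of order, e.g. after reevaluation
print(history.value_at(25))       # b
```

Keeping every write, rather than only the latest, is what lets a lookup at an arbitrary trace position succeed without replaying the trace.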

The runtime memo table. In the abstract machine, mem- 
oization permits trace reuse by matching memo points in 
the reuse trace. In the runtime, a hash table indexes each 
such memo point in the trace. While evaluating a new memo 
point, the runtime system attempts to locate matches using 
this hash table. Once matched, the change propagation algo- 
rithm begins working on the reused trace. 

Change propagation as an algorithm. In the abstract ma- 
chine, change propagation has two purposes: to replay store 
effects (viz. allocs, writes) and to ensure that every reused 
read action is consistent with the current store. However, the 
machine specifies change propagation as a complete traversal
of the trace, while in practice this is not efficient. As
a result, the algorithm performs change propagation somewhat
differently, while still accomplishing its two high-level
goals: replaying store effects and ensuring that reads are 
consistent. 

First, to replay store effects, the algorithm relies on the 
runtime store being retained from one run to the next. That 
is, the final store of one run becomes the initial store of 
the next run. This retention is modulo changes in some 
store entries, and the reclamation of locations marked as 
garbage. Consequently, since the runtime store keeps every 
traced effect — not just the most recent ones, as in the abstract 
machine — it is not necessary to replay these effects for the 
benefit of updating the store. 

Second, to find and reevaluate inconsistent read actions, 
the runtime maintains a priority queue Q of them, ordered 
by their appearance in the trace. Rather than find them 
one-by-one via trace traversal, when the runtime store is 
changed, it uses the runtime store representation described 
above to identify any inconsistent reads and enqueue the 
smallest enclosing update point into Q. To find this update 
point quickly, each read action maintains a reference to this 
(unique) enclosing update action. The change propagation 
algorithm consists of a loop that reevaluates the update 
points in Q, in trace order. 
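
The propagation loop described above can be sketched with a binary heap ordered by time stamp. The function `propagate` and the callback `reevaluate` are hypothetical names; the callback is assumed to return any further update points that its reevaluation invalidates, as (stamp, name) pairs.

```python
import heapq

def propagate(queue, reevaluate):
    """Reevaluate queued update points in trace order (smallest stamp first)."""
    done, seen = [], set()
    while queue:
        stamp, name = heapq.heappop(queue)
        if name in seen:          # an update point may be enqueued more than once
            continue
        seen.add(name)
        for later in reevaluate(name):
            heapq.heappush(queue, later)
        done.append(name)
    return done

# Reevaluating u1 changes a value that a later read under u2 depends on.
invalidates = {"u1": [(20, "u2")]}
queue = [(30, "u3"), (10, "u1")]
heapq.heapify(queue)
order_run = propagate(queue, lambda name: invalidates.get(name, []))
print(order_run)  # ['u1', 'u2', 'u3']
```

Processing in trace order matters: an earlier reevaluation can invalidate later reads, so those must be handled after it rather than in arrival order.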

7.2 Compilation 

We compile IL programs in several phases. First, we convert
them into destination-passing style; this ensures that they 
will replay correctly during change propagation. Next, we 
implement each traced expression with a corresponding call 
into the runtime, described above. For most traced forms, 
this is very straightforward; however, handling the update 
and memo forms requires more care, which we discuss be- 
low. Finally, we translate the resulting IL program into our 
target language, C, and compile this code with gcc. 

Compiling update and memo. In contrast to the other
traced forms, memo and update each save and restore the
local state of an IL program: an environment $\rho$ and an IL
expression $e$. To compile these forms, the following questions
arise: How much of the environment $\rho$ should be recorded
in the runtime trace and/or memo table? Once an IL
expression $e$ is translated into a target language, how do we
reevaluate it during change propagation?

First, we address how we save the environment. At each
of these points we use a standard analysis (e.g., [30]) to
approximate the live variables LV(e) at each such $e$, and then
save not $\rho$, but rather $\rho|_{LV(e)}$, i.e., $\rho$ limited to LV(e). This has
two important consequences: we save space by not storing
dead variables in the trace, and we (monotonically) increase
the potential for memo matches, as non-matching values of
dead variables do not cause a potential match to fail.
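
The effect of restricting the environment to live variables on memo matching can be sketched as follows. The helpers `restrict` and `memo_key`, the expression identifier "e7" and the stored value are all hypothetical; the live-variable set is assumed to be supplied by a separate analysis.

```python
def restrict(env, live_vars):
    """Keep only the bindings of variables that are live."""
    return {x: v for x, v in env.items() if x in live_vars}

def memo_key(expr_id, env, live_vars):
    """Hashable key: the expression plus its live bindings in a fixed order."""
    return (expr_id, tuple(sorted(restrict(env, live_vars).items())))

memo_table = {}
memo_table[memo_key("e7", {"x": 1, "tmp": 99}, {"x"})] = "trace-node-A"

# Same live binding, different dead variable: still a match.
hit = memo_table.get(memo_key("e7", {"x": 1, "tmp": -5}, {"x"}))
# Different live binding: no match.
miss = memo_table.get(memo_key("e7", {"x": 2, "tmp": 99}, {"x"}))
print(hit, miss)  # trace-node-A None
```

Dropping dead variables from the key both shrinks what is stored and makes matches strictly more likely, which is the monotonicity noted above.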

Second, we address the issue of fine-grained reevaluation.
This poses a problem since languages such as C do
not allow programs to jump to arbitrary control points,
e.g., into the middle of a procedure. To address this
limitation, we adapt the "lambda-lifting" technique used in earlier
work [23]. Originally this technique transformed the control
flow graphs of C code; we modify it for IL such that after
being applied, all update points have the form update $f(\overline{x})$,
where $f$ is a top-level function and where variables $x_i \in \overline{x}$
close the body of $f$. In this form, we implement each update
point as an explicitly constructed function closure, i.e., a
record consisting of a function pointer and values for its
arguments.

7.3 Optimizations 

We refine the basic approach above with two optimizations. 

Trace node sharing (share). The basic runtime system
(Section 7.1) assigns each trace action $t$ to a distinct trace
node, with a distinct time stamp $s$. Since each trace node
brings some overhead, it is desirable if sequences of
consecutive trace actions $t = t_1, \ldots, t_n$ can share a single trace
node with a single time stamp. However, this optimization is
complicated by a few issues.

First, how do we realize the comparison ti < tj when ti 
and tj share a single time stamp? We can accomplish this 
by following the order of t when placing the actions into the 
trace node; this allows us to efficiently compare ti with tj by 
comparing their addresses. 

Second, how do we avoid breaking the sequence when 
it uses a single trace node? This can happen in one of two 
ways: by either memo-matching some action in the middle 
of t, thereby discarding its prefix; or by reevaluating an 
update point in the middle of t when this reevaluation takes 
a new control path. We avoid these scenarios by packing 
sequence t into a single trace node only when the following 
criteria are met: if t contains a memo point, then it appears 
first; if t contains an update point, then the remaining suffix 
of t is generated by straight-line code. 

Selective destination-passing style (seldps). The DPS
conversion (Figure 17) introduces extra IL code for push and
pop expressions: an extra alloc, update, memo, and some
writes and reads. Since each of these expressions is traced,
this can introduce considerable overhead for subcomputations
that do not interact with changing data. In fact, without
an update point, propagation over the trace of $e$ will always
yield the same return values (Lemma A.19). Moreover, it is
clear from the definition of store agnosticism (Section 4.1)
that any computation without an update point is trivially
CSA; hence, there is no need to DPS-convert it. By doing a
conservative static analysis, our compiler estimates whether
each expression $e$ appearing in the form push $f$ do $e$ can
reach an update point during evaluation. If not, we do not
apply the DPS conversion to push $f$ do $e$. We refer to this
refined transformation as selective DPS conversion.
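
One way such a conservative analysis could look is sketched below. It is not the compiler's actual analysis: expressions are modeled as nested tuples whose first component is a tag, function bodies live in a dictionary `funs`, and any call to an unknown function is assumed to reach an update point. All of these representation choices are assumptions of this example.

```python
def may_reach_update(expr, funs, visiting=frozenset()):
    """Conservatively decide whether evaluating expr may reach an update point."""
    tag = expr[0]
    if tag == "update":
        return True
    if tag == "call":
        f = expr[1]
        if f not in funs:      # unknown callee: assume the worst
            return True
        if f in visiting:      # recursive call: its body is already being explored
            return False
        return may_reach_update(funs[f], funs, visiting | {f})
    # Any other form: check its sub-expressions.
    return any(may_reach_update(sub, funs, visiting)
               for sub in expr[1:] if isinstance(sub, tuple))

funs = {
    "pure": ("prim", "+"),
    "reader": ("update", ("read", "p")),
    "loop": ("if", ("call", "loop"), ("pop",)),
}
needs_dps = may_reach_update(("push", "f", ("call", "reader")), funs)
skip_dps = not may_reach_update(("push", "f", ("call", "pure")), funs)
print(needs_dps, skip_dps)  # True True
```

Erring on the side of "may reach" is safe here: it only causes some push bodies to be DPS-converted unnecessarily, never the reverse.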
8. Implementation and a C Front End

Our current implementation consists of a compiler and an
associated runtime system, as outlined in Section 7.
Additionally, we also implement the optimizations from
Section 7.3. After compiling and optimizing IL, our implementation
translates it to C, which we compile using gcc. In all,
our compiler consists of a 10k line extension to CIL and our
runtime system consists of about 6k lines of C code. We plan
to publicly release the system in summer 2011; in the
meantime, we happily offer it to reviewers upon request.

As a front-end to IL, we support a C-like source language,
Csrc. We use CIL [31] to parse Csrc source into a control-flow
graph representation. To bridge the gap between this
representation and IL, we utilize a known relationship between
static single assignment (SSA) form and lexically-scoped,
functional programming [11].

Before this translation, we move Csrc variables to the heap 
if either they are globally-scoped, aliased by a pointer (via 
Csrc's address-of operator, &), or are larger than a single 
machine word. When such variables come into scope, we 
allocate space for them in the heap (via alloc); for global 
variables, this allocation only happens once, at the start of 
execution. 

As a part of the translation to IL, we automatically place
update points before each read (or consecutive sequence
of reads). Though in principle we can automatically place
memo points anywhere, we currently leave their placement
to the programmer by providing a memo keyword in Csrc;
this keyword can be used as a Csrc statement, as well as a
wrapper around arbitrary Csrc expressions.

8.1 Current Limitations 

Our source language Csrc is more restricted than C in a few 
ways, though most of these restrictions are merely for tech- 
nical reasons and could be solved with further compiler en- 
gineering. First, while Csrc programs may use variadic func- 
tions provided by external libraries (e.g., printf ), Csrc does 
not currently support the definition of new variadic func- 
tions. Furthermore, function argument and return types must 
be scalar (pointer or base types) and not composite types 
(struct and union types). Removing these restrictions may 
pose engineering challenges, but should not require a funda- 
mental change to our approach. 

Second, our Csrc front-end assumes that the program's 
memory accesses are word aligned. This assumption greatly 
simplifies the translation of pointer dereferencing and
assignment in Csrc into the read and write instructions in IL,
respectively. To lift this restriction, we could dynamically 
check the alignment of each pointer before doing the access, 
and decompose those accesses that are not word-aligned into 
one (or two) that are. 
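
The decomposition mentioned above can be sketched as a small address computation. It assumes an 8-byte word and a hypothetical helper `aligned_words` that returns the aligned word addresses covering an access; how the bytes would then be extracted or merged is left out.

```python
WORD = 8  # assumed word size in bytes

def aligned_words(addr, size=WORD, word=WORD):
    """Addresses of the aligned words that cover bytes [addr, addr + size)."""
    first = addr - addr % word
    last_byte = addr + size - 1
    last = last_byte - last_byte % word
    return list(range(first, last + word, word))

print(aligned_words(16))      # [16]      already aligned: one access
print(aligned_words(19))      # [16, 24]  straddles a boundary: two accesses
print(aligned_words(20, 4))   # [16]      small access inside one word
```

For accesses no larger than a word, the result always has one or two entries, matching the "one (or two)" in the text.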

Third, as a more fundamental challenge, Csrc does not
currently support features of C that change the stack
discipline of the language, such as setjmp/longjmp. In C,
these functions are often used to mimic the control operators
and/or exception handling found in higher-level languages.
Supporting these features is beyond the scope of this paper,
but remains of interest for future work.

Finally, to improve efficiency, programs written in Csrc 
can be mixed with foreign C code (e.g., from a standard C 
library). Since foreign C code is not traced, it allows those 
parts of the program to run faster, as they do not incur the 
tracing overhead that would otherwise be incurred within 
Csrc. However, mixing of Csrc and foreign C code results
in a programming setting that is not generally sound, and 
contains potential pitfalls. In particular, in this setting pro- 
grams must adhere to the following correct usage restriction 
to ensure the consistency of change propagation: each mem- 
ory location is either accessed exclusively by foreign C code 
(not by Csrc code) or exclusively by Csrc code (not by for- 
eign C code). While a skilled programmer can observe this 
restriction (we mix foreign C code with Csrc code for some 
of our benchmarks), we currently provide no static or dy- 
namic check that this restriction is met. Such checks pose 
interesting challenges for future work. 

9. Evaluation 

We empirically evaluate our approach by considering a
number of benchmarks written in Csrc (Section 8), compiled
with our compiler (Section 7). Our experiments are very
encouraging, showing that our approach can yield asymptotic
speedups, resulting in orders of magnitude speedups in
practice; it does this while incurring only moderate overheads for
pre-processing or initial executions. We evaluate our
compiler and runtime optimizations (Section 7.3), showing that
they improve performance of both from-scratch evaluation
as well as of change propagation. Comparisons with previous
work using the unsound CEAL library and the DeltaML
language show that our approach performs competitively.

9.1 Benchmarks and Measurements 

Our benchmarks consist of expression tree evaluation (i.e.,
the example from Section 2), some list primitives, two sorting
algorithms and several computational geometry algorithms.
For our timings, we used a Linux box running on a
1.8 GHz Intel Xeon (4-core) processor with 5 12GB memory.
All our benchmarks are sequential and are compiled with
gcc -O3 after translation to C.

For each benchmark, we measure the from-scratch time, 
the time to run the benchmark from-scratch on a particular 
input, and the average update time, the average time required 
by change propagation to update the output after inserting or 
deleting an element from its input. We compute this average 
by iterating over the initial input, deleting each input ele- 
ment, updating the output by change propagation, inserting 
the element again and updating the output by change propa- 
gation. 
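
The measurement procedure can be sketched as a loop that deletes and reinserts each input element, propagating after every change. In this sketch the callback `propagate` is a stand-in that recomputes a minimum from scratch, since no self-adjusting runtime is available here; the function name `average_update_time` is also an assumption.

```python
import time

def average_update_time(data, propagate):
    """Delete and reinsert each element, propagating after every change."""
    updates = 0
    start = time.perf_counter()
    for i in range(len(data)):
        x = data.pop(i)       # delete one input element
        propagate()
        data.insert(i, x)     # insert it again at the same position
        propagate()
        updates += 2
    elapsed = time.perf_counter() - start
    return elapsed / updates if updates else 0.0

data = [5, 3, 8, 1]
outputs = []
avg = average_update_time(data, lambda: outputs.append(min(data)))
print(len(outputs), data)  # 8 [5, 3, 8, 1]
```

Each element contributes two updates (one deletion, one insertion), and the input is back in its initial state once the loop ends.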
List primitives. These benchmarks include filter, map
(performs integer additions per element), reverse, minimum 
(integer comparison), and sum (integer addition), and the 
sorting algorithms quicksort (string comparison) and merge- 
sort (string comparison). We generate lists of n (uniformly) 
random integers as input for the list primitives. For sorting 
algorithms, we generate lists of n (uniformly) random, 32- 
character strings. We implement each list benchmark men- 
tioned above by using an external C library for lists, which 
our compiler links against the self-adjusting code after com- 
pilation. 

Computational geometry. These benchmarks include quickhull,
diameter, and distance; quickhull computes the convex
hull of a point set using the standard quickhull algorithm;
diameter computes the diameter, i.e., the maximum distance
between any two points of a point set; distance computes
the minimum distance between two sets of points. Our
implementations of diameter and distance use quickhull to
first compute the convex hull and then compute the diameter
and the distance of the points on the hull (the furthest away
points lie on the convex hull). For quickhull and diameter,
input points are selected from a uniform distribution over
the unit square in $\mathbb{R}^2$. For distance, we select equal
numbers of points from two non-overlapping unit squares in
$\mathbb{R}^2$. We represent real numbers with double-precision
floating-point numbers. As with the list benchmarks, each
computational geometry benchmark uses an external C library;
in this case, the external library provides geometric
primitives for creating points and lines, and computing simple
properties about them (e.g., line-point distance).
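
To make the structure of the diameter benchmark concrete, here is a minimal Python sketch of the conventional (non-self-adjusting) computation: compute the hull with quickhull, then take the maximum pairwise distance among hull points. The function names and the tuple representation of points are assumptions made for this illustration; our benchmarks are written against the C library mentioned above.

```python
import math

def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive if b is left of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _hull_side(pts, a, b):
    """Hull vertices strictly left of the directed line a->b, excluding a and b."""
    left = [p for p in pts if cross(a, b, p) > 0]
    if not left:
        return []
    far = max(left, key=lambda p: cross(a, b, p))   # furthest point from line a->b
    return _hull_side(left, a, far) + [far] + _hull_side(left, far, b)

def quickhull(points):
    """Convex hull vertices of a 2D point set (collinear boundary points dropped)."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    a, b = pts[0], pts[-1]                           # leftmost and rightmost points
    return [a] + _hull_side(pts, a, b) + [b] + _hull_side(pts, b, a)

def diameter(points):
    """Maximum distance between any two points, searched on the hull only."""
    hull = quickhull(points)
    return max((math.dist(p, q) for p in hull for q in hull), default=0.0)

square = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.25, 0.75)]
hull = quickhull(square)
```

The pairwise search over hull vertices is quadratic in the hull size; this keeps the sketch short and is not meant to reflect how the benchmark itself searches the hull.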

Benchmark targets. In order to study the effectiveness of
the compiler and runtime optimizations (Section 7.3), for
each benchmark we generate several targets. Each target is
the result of choosing to use some subset of our optimizations.
Table 1 lists and describes each target that we consider.
Before measuring the performance of these targets, we use
regression tests to verify that their self-adjusting semantics
are consistent with conventional (non-self-adjusting) versions.
These tests empirically verify our consistency theorem
(Theorem 4.3).



Target   Optimizations used

no-opt   No optimization is used.
share    Same as no-opt except that certain trace actions
         can share trace nodes.
seldps   Same as no-opt except that the DPS transformation
         is selective: only certain functions are transformed.
opt      Both seldps and share are used.

Table 1. Targets and their optimizations (Section 7.3).



Figure 18. Comparison of benchmark targets (no-opt, share, seldps, opt).



9.2 Optimizations 

Figure 18 compares our targets' from-scratch running time
and average update time. Each bar is normalized to the
no-opt target. The rightmost column in each bar graph shows
the mean over all benchmarks. To estimate the efficacy of
an optimization X, we can compare target no-opt with the
target where X is turned on.

In the mean, the fully optimized targets (opt) are nearly 
30% faster from-scratch, and nearly 50% faster during au- 
tomatic updates (via change propagation), when compared 
to the unoptimized versions (no-opt). These results demon- 
strate that our optimizations, while conceptually straightfor- 
ward, are also practically effective: they significantly im- 
prove the performance of the self-adjusting targets, espe- 
cially during change propagation. 

9.3 Summary of Experimental Results 



Table 2 summarizes the self-adjusting performance of the
benchmarks by comparing them to conventional, non-self-adjusting
C code. From left to right, the columns show the
benchmark name, the input size we considered (N), the time
to run the conventional (non-self-adjusting) version (Conv),
the from-scratch time of the self-adjusting version (FS), the
preprocessing overhead associated with the self-adjusting
version (Overhead is the ratio FS/Conv), the average update
time for the self-adjusting version (Ave. Update) and
the speed-up gained by using change propagation to update
the output versus rerunning the conventional version
(Speed-up is the ratio Conv/Ave. Update). All reported times are
Benchmark   N      Conv   FS     Overhead   Ave. Update     Speed-up

…           10^…   0.18   1.53   …          …               …
map         10^…   0.10   1.87   …          …               …
reverse     10^…   0.10   1.81   …          …               …
filter      …      0.13   1.42   …          …               …
sum         …      0.14   1.35   9.6        …               …
…   1.36   …   9.5 x 10^-…   …   3.7   …
…           …      0.26   0.90   3.4        1.5 x 10^-…     1.8 x 10^…
…           10^…   0.24   0.81   3.4        …               7.9 x 10^…


Table 2. Summary of benchmark results (using opt target of each benchmark). 
Figure 19. minimum, quickhull and quicksort performance in DeltaML, CEAL and our own opt versions (labeled SASM).



in seconds. For the self-adjusting versions, we use the
optimized (opt) target of each benchmark.

The preprocessing overheads of most benchmarks are less
than a factor of ten; for simpler list primitives benchmarks,
this overhead is about 18 or less. However, even at these only
moderate input sizes (viz. 10^ and 10^6), the self-adjusting
versions deliver speed-ups of two, three or four orders of
magnitude. Moreover, as we illustrate below (Section 9.4),
these speedups increase with input size.
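
As a quick check of how the two ratio columns are derived, the following Python sketch recomputes Overhead = FS/Conv for the sum row of Table 2 (Conv 0.14 s, FS 1.35 s). The update time used for the speed-up is a made-up placeholder, not a measured value, and the function names are ours.

```python
def overhead(fs, conv):
    """Preprocessing overhead: from-scratch self-adjusting time over conventional time."""
    return fs / conv

def speedup(conv, avg_update):
    """Speed-up: conventional rerun time over average change-propagation time."""
    return conv / avg_update

# Values for the sum benchmark as reported in Table 2.
conv_sum, fs_sum = 0.14, 1.35
print(round(overhead(fs_sum, conv_sum), 1))    # prints 9.6

# Placeholder update time (an assumption for illustration only).
hypothetical_update = 1.0e-5
print(speedup(conv_sum, hypothetical_update))
```

With an update time on the order of 10^-5 seconds, a conventional run of 0.14 seconds corresponds to a speed-up of roughly four orders of magnitude, which is the range discussed in the text.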

9.4 Comparison to Past Work 

To illustrate how our implementation compares with past
systems, Figure 19 gives representative examples. It
compares the from-scratch and average update times for three
self-adjusting benchmarks across three different implementations:
one in DeltaML [27], one in CEAL [23] and the
opt target of our implementation (labeled SASM, for
Self-Adjusting Stack Machines). In the from-scratch graphs, we
also compare with the conventional (non-self-adjusting) C
implementations of each benchmark (labeled Conv).

The three benchmarks shown (viz. minimum, quickhull
and quicksort) illustrate a general trend. First, in
from-scratch runs, the SASM implementations are only slightly
slower than those of CEAL, while the DeltaML implementations
are considerably slower than both. For instance, in the
case of quicksort, the DeltaML implementation is a factor
of ten slower than our own. While updating the computation
via change propagation, the performance of the SASM
implementations lies somewhere between that of DeltaML and
CEAL, with CEAL consistently being either faster than the
others, or comparable to SASM. Although not reported here,
we obtain similar results with other benchmarks.

10. Related Work 

We discuss the most closely related work in the previous
sections of the paper, especially Section 1. Here, we briefly
characterize earlier work on incremental computation and
more recent work on generalizing self-adjusting-computation
techniques to support parallel computation.

Of the many techniques proposed to support incremental
computation (see the survey [33]), the most effective ones
are dependence graphs, memoization, and partial evaluation.
Dependence graphs record the dependencies between data 
in a computation and rely on a change-propagation algo- 
rithm to update the computation when the input is modi- 
fied (e.g., fTF/ZSl). Dependence graphs are effective in some 
applications, e.g. syntax-directed computations, but are not 
general-purpose because change propagation does not update
the dependencies. For example, the INC language [37],
which uses dependence graphs, does not permit recursion.
Memoization (also called function caching) (e.g., ^ |24l 
[32l ) applies to any purely functional program and therefore 
is more broadly applicable than dependence graphs. This 
classic idea dating back to the late 1950's ifTH l28l |29l can 
improve efficiency when executions of a program with sim- 
ilar inputs perform similar function calls. It turns out, how- 
ever, that even a small input modification can prevent reuse 
via memoization, e.g., when they affect computations deep 
in the call tree [7]. Partial evaluation approaches [19, 36]
require the user to fix a part of the input and specialize the
program to speed up modifications to the remaining unfixed
part. The main limitation of this approach is that it allows 
input modifications only within a predetermined partition. 
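
The limitation of memoization noted above can be seen in a small Python experiment. This is an illustrative sketch using the standard functools.lru_cache decorator, not a model of any of the cited systems; the function total and the counter are names introduced here. A recursive sum over a tuple is memoized on its argument: changing the first element (shallow in the call tree) reuses every recursive call, while changing the last element (deep in the call tree) reuses almost nothing.

```python
from functools import lru_cache

calls = 0   # counts executions of the function body, i.e., memo misses

@lru_cache(maxsize=None)
def total(xs):
    """Memoized recursive sum over an immutable tuple."""
    global calls
    calls += 1
    return 0 if not xs else xs[0] + total(xs[1:])

def misses(xs):
    """Number of calls that could not be answered from the memo table."""
    before = calls
    total(xs)
    return calls - before

base = tuple(range(100))
first_run = misses(base)                   # 101: one call per suffix, including ()
head_change = misses((-1,) + base[1:])     # 1: only the outermost call is new
tail_change = misses(base[:-1] + (-1,))    # 100: every non-empty suffix is new
```

In the last case the only reused result is the one for the empty tuple, so the memoized rerun does essentially the same work as a from-scratch run.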

In addition to the early systems discussed above, a more 
recent system, DITTO [34], offers support for incremental 
invariants-checking in Java. It requires no programmer an- 
notations but only supports a purely-functional subset of 
Java. DITTO also places further restrictions on the pro- 
grams; while these restrictions are reasonable for expressing 
invariant checks, they also narrow the scope of the approach. 

More recent work generalized self-adjusting computation
techniques to support parallel computations. A paper
presents an algorithm for parallel change propagation [22];
other papers consider applying parallel self-adjusting
computation to individual problems [9, 35], as well as the
map-reduce framework [13], a more general setting.

11. Conclusion 

We described a sound abstract machine semantics for self- 
adjusting computation based on a low-level intermediate lan- 
guage. We implemented this language by presenting compi- 
lation and optimization techniques, including a C-like front 
end. Our experiments confirm that the self-adjusting pro- 
grams produced with our approach often perform asymp- 
totically faster than full reevaluation, resulting in orders of 



magnitude speedups in practice. We also confirmed that our 
approach is competitive with past approaches, which are ei- 
ther unsound or unsuited to low-level settings. 
A. Proofs for Consistency

Definition A.1 (SA). We define $\mathrm{SA}(\sigma, \rho, e)$ to mean the following:

If $\sigma, \varepsilon, \rho, e \xrightarrow{r}{}^{*} \_, \_, \rho', \mathtt{update}\ e'$, then there exists $\bar{\nu}$ such that $\bar{\omega} = \bar{\nu}$ whenever $\_, \varepsilon, \rho', e' \xrightarrow{r}{}^{*} \_, \varepsilon, \varepsilon, \bar{\omega}$.

Definition A.2 (CSA). We define $\mathrm{CSA}(\sigma, \rho, e)$ to mean the following:

If $\sigma, \varepsilon, \rho, e \xrightarrow{r}{}^{*} \sigma', \kappa, \rho', e'$, then $\mathrm{SA}(\sigma', \rho', e')$.
Definition A.3. We write $\mathrm{noreuse}(\langle \Pi, T \rangle)$ if and only if

1. $T = \varepsilon$

2. $\boxminus_{\_} \notin \Pi$

3. $\boxplus_{\_} \notin \Pi$

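Definition A.4 (From-scratch consistent traces). A trace $T$ is from-scratch consistent, written $T$ fsc, if and only if there exists
a closed command $(\rho, \alpha_r)$, store $\sigma$, and trace context $\Pi$ such that

1. $\mathrm{CSA}(\sigma, \rho, \alpha_r)$

2. $\mathrm{noreuse}(\langle \Pi, \varepsilon \rangle)$

3. $\langle \Pi, \varepsilon \rangle, \sigma, \kappa, \rho, \alpha_r \xrightarrow{t}{}^{*} \langle \Pi', \varepsilon \rangle, \sigma', \kappa, \varepsilon, \bar{\nu}$

4. $\langle \Pi', \varepsilon \rangle; \varepsilon \circlearrowleft^{*} \langle \Pi, \varepsilon \rangle; T$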

Lemma A.1 (From traced to untraced). If $\langle \Pi, T \rangle, \sigma, \kappa, \rho, \alpha_r \xrightarrow{t}{}^{n} \langle \Pi', T' \rangle, \sigma', \kappa', \rho', \alpha_r'$ using only E.* and U.*, then we also
have $\sigma, \kappa, \rho, \alpha_r \xrightarrow{r}{}^{*} \sigma', \kappa', \rho', \alpha_r'$.

Proof. By induction on $n$. When $n = 0$, the claim is obvious. For $n > 0$, we inspect the first step taken. In each possible case,
it is easy to verify that the claim follows by induction. □

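Definition A.5 (Okay traces).

$$\frac{}{\varepsilon\ \mathrm{ok}} \qquad \frac{T_1\ \mathrm{ok} \quad T_2\ \mathrm{ok}}{(T_1) \cdot T_2\ \mathrm{ok}} \qquad \frac{T\ \mathrm{fsc}}{T\ \mathrm{ok}}$$

Definition A.6 (Okay trace contexts).

$$\frac{}{\varepsilon\ \mathrm{ok}} \qquad \frac{\Pi\ \mathrm{ok}}{\Pi \cdot t\ \mathrm{ok}} \qquad \frac{\Pi\ \mathrm{ok}}{\Pi \cdot \square\ \mathrm{ok}} \qquad \frac{\Pi\ \mathrm{ok} \quad T\ \mathrm{fsc}}{\Pi \cdot \boxplus_{T}\ \mathrm{ok}} \qquad \frac{\Pi\ \mathrm{ok} \quad T\ \mathrm{ok}}{\Pi \cdot \boxminus_{T}\ \mathrm{ok}}$$

Definition A.7 (Okay trace zippers).

$$\frac{\Pi\ \mathrm{ok} \quad T\ \mathrm{ok}}{\langle \Pi, T \rangle\ \mathrm{ok}}$$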

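Lemma A.2 (Rewinding is okay). If $\langle \Pi, T \rangle; \_ \circlearrowleft^{*} \langle \Pi', T' \rangle; \_$ and $\langle \Pi, T \rangle$ ok then $\langle \Pi', T' \rangle$ ok.

Proof. Trivial induction around the following case analysis of a rewind step.

• Case $\langle \Pi_1 \cdot t, T \rangle; \_ \circlearrowleft \langle \Pi_1, T \rangle; \_$

■ By assumption, $\Pi_1 \cdot t$ ok and $T$ ok

■ By inversion, $\Pi_1$ ok

■ Hence, $\langle \Pi_1, T \rangle$ ok.

• Case $\langle \Pi_1 \cdot \boxminus_{T_1}, \varepsilon \rangle; \_ \circlearrowleft \langle \Pi_1, T_1 \rangle; \_$

■ By assumption, $\Pi_1 \cdot \boxminus_{T_1}$ ok

■ By inversion, $\Pi_1$ ok and $T_1$ ok

■ Hence, $\langle \Pi_1, T_1 \rangle$ ok.

• Case $\langle \Pi_1 \cdot \boxminus_{T_2}, t \cdot T_1 \rangle; \_ \circlearrowleft \langle \Pi_1, (t \cdot T_1) \cdot T_2 \rangle; \_$

■ By assumption, $\Pi_1 \cdot \boxminus_{T_2}$ ok and $t \cdot T_1$ ok

■ By inversion, $\Pi_1$ ok and $T_2$ ok

■ Hence, $(t \cdot T_1) \cdot T_2$ ok

■ And $\langle \Pi_1, (t \cdot T_1) \cdot T_2 \rangle$ ok.

□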

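Lemma A.3 (Purity). If $\langle \Pi_1, T_1 \rangle, \sigma_1, \kappa_1, \rho, \alpha_r \xrightarrow{t}{}^{*} \langle \Pi_1, T_1 \rangle, \sigma_1, \kappa_1, \rho', \alpha_r'$ using E.0 only, then for any $\Pi_2, T_2, \sigma_2, \kappa_2$ we
have $\langle \Pi_2, T_2 \rangle, \sigma_2, \kappa_2, \rho, \alpha_r \xrightarrow{t}{}^{*} \langle \Pi_2, T_2 \rangle, \sigma_2, \kappa_2, \rho', \alpha_r'$.

Proof. Trivial induction. □

Lemma A.4 (Rewinding). If $\langle \Pi', \_ \rangle; \_ \circlearrowleft^{*} \langle \Pi, \_ \rangle; \_$ then

1. $\Pi \in \mathrm{Prefixes}(\Pi')$,

2. $\#_{\square}(\Pi') = \#_{\square}(\Pi)$, and

3. $\#_{\boxplus}(\Pi') = \#_{\boxplus}(\Pi)$.

Proof. By induction on the number $n$ of rewinding steps. If $n = 0$, then $\Pi = \Pi'$ and the claim holds trivially. Suppose
$n = 1 + n'$. Case analysis on the first step:

• Case $\langle \Pi', T_1' \rangle; T_2' \circlearrowleft \langle \Pi'', T_1' \rangle; t \cdot T_2' \circlearrowleft^{n'} \langle \Pi, T_1 \rangle; T_2$ with $\Pi' = \Pi'' \cdot t$:

■ By induction we get $\Pi \in \mathrm{Prefixes}(\Pi'') \wedge \#_{\square}(\Pi'') = \#_{\square}(\Pi) \wedge \#_{\boxplus}(\Pi'') = \#_{\boxplus}(\Pi)$.

■ This implies the claims.

• Case $\langle \Pi', T_1' \rangle; T_2' \circlearrowleft \langle \Pi'', T \rangle; T_2' \circlearrowleft^{n'} \langle \Pi, T_1 \rangle; T_2$ with $T_1' = \varepsilon$ and $\Pi' = \Pi'' \cdot \boxminus_{T}$:

■ By induction we get $\Pi \in \mathrm{Prefixes}(\Pi'') \wedge \#_{\square}(\Pi'') = \#_{\square}(\Pi) \wedge \#_{\boxplus}(\Pi'') = \#_{\boxplus}(\Pi)$.

■ This implies the claims.

• Case $\langle \Pi', T_1' \rangle; T_2' \circlearrowleft \langle \Pi'', (T_1') \cdot T \rangle; T_2' \circlearrowleft^{n'} \langle \Pi, T_1 \rangle; T_2$ with $T_1' = t \cdot T_1''$ and $\Pi' = \Pi'' \cdot \boxminus_{T}$:

■ By induction we get $\Pi \in \mathrm{Prefixes}(\Pi'') \wedge \#_{\square}(\Pi'') = \#_{\square}(\Pi) \wedge \#_{\boxplus}(\Pi'') = \#_{\boxplus}(\Pi)$.

■ This implies the claims.

□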
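
The three rewinding cases that appear in the proofs of Lemmas A.2 and A.4 can be exercised on a small example. The Python sketch below is an informal model reconstructed from those case analyses only; the tuple encoding of contexts and the helper names are assumptions of this illustration. Rewinding moves a trailing trace action onto the second trace, unpacks a trailing box-minus marker (wrapping a non-empty reuse trace in parentheses), and stops at a push marker or a box-plus marker.

```python
PUSH = ("push",)                       # the push marker (a box in the text)

def act(name):
    return ("act", name)               # a trace action t

def prop(trace):
    return ("prop", trace)             # box-plus marker carrying a trace

def undo(trace):
    return ("undo", trace)             # box-minus marker carrying a trace

def rewind_step(ctx, t1, t2):
    """One rewinding step on (ctx, t1); t2, or None if no step applies."""
    if not ctx:
        return None
    top, rest = ctx[-1], ctx[:-1]
    if top[0] == "act":                # <Pi.t, T1>; T2  ~>  <Pi, T1>; t.T2
        return rest, t1, (top[1],) + t2
    if top[0] == "undo":
        stored = top[1]
        if not t1:                     # <Pi.undo_T, eps>; T2  ~>  <Pi, T>; T2
            return rest, stored, t2
        # <Pi.undo_T, t.T1>; T2  ~>  <Pi, (t.T1).T>; T2
        return rest, (("paren", t1),) + stored, t2
    return None                        # push and box-plus markers stop rewinding

def rewind_all(ctx, t1, t2):
    """All states reached by rewinding as far as possible, starting state first."""
    states = [(ctx, t1, t2)]
    while True:
        nxt = rewind_step(*states[-1])
        if nxt is None:
            return states
        states.append(nxt)

def count(ctx, kind):
    return sum(1 for item in ctx if item[0] == kind)

def is_prefix(small, big):
    return big[:len(small)] == small

ctx = (act("a"), PUSH, act("b"), undo(("c",)), act("d"))
states = rewind_all(ctx, ("e",), ())
final_ctx, final_t1, final_t2 = states[-1]
```

In this example rewinding stops at the push marker with final_ctx equal to (act("a"), PUSH). Every intermediate context is a prefix of the starting one and the marker counts are unchanged, matching the three claims of Lemma A.4.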

Lemma A.5. If $\Pi \in \mathrm{Prefixes}(\Pi')$, then $\mathrm{drop}_{\boxminus}(\Pi) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$.
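
Lemma A.5 can be checked exhaustively on small contexts. The following Python sketch uses an assumed encoding in which a trace context is a tuple of tagged items and the drop operation filters out the box-minus markers; the names drop_undo and prefixes are ours.

```python
def drop_undo(ctx):
    """Remove every box-minus marker (tagged 'undo') from a trace context."""
    return tuple(item for item in ctx if item[0] != "undo")

def prefixes(ctx):
    """All prefixes of a context, from the empty context to ctx itself."""
    return [ctx[:i] for i in range(len(ctx) + 1)]

ctx = (("act", "a"), ("undo", ("x",)), ("push",), ("act", "b"), ("undo", ()))

# For every prefix p of ctx, drop_undo(p) is a prefix of drop_undo(ctx).
holds = all(drop_undo(p) in prefixes(drop_undo(ctx)) for p in prefixes(ctx))
```

The property holds because filtering preserves the relative order of the remaining items, so filtering a prefix yields a prefix of the filtered whole.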

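Lemma A.6. If $\langle \Pi, T_1 \rangle; T \circlearrowleft^{*} \langle \Pi', \_ \rangle; T'$, then $\langle \mathrm{drop}_{\boxminus}(\Pi), T_1 \rangle; T \circlearrowleft^{*} \langle \mathrm{drop}_{\boxminus}(\Pi'), \_ \rangle; T'$.

Proof. By induction on the number $n$ of rewinding steps. If $n = 0$, then $\Pi = \Pi'$ and $T = T'$, so the claim holds obviously.

Now suppose $n > 0$. We inspect the last step:

• Case $\langle \Pi, T_1 \rangle; T \circlearrowleft^{*} \langle \Pi' \cdot t, \_ \rangle; T_2 \circlearrowleft \langle \Pi', \_ \rangle; T'$ with $T' = t \cdot T_2$:

■ By induction, $\langle \mathrm{drop}_{\boxminus}(\Pi), T_1 \rangle; T \circlearrowleft^{*} \langle \mathrm{drop}_{\boxminus}(\Pi' \cdot t), \_ \rangle; T_2 \circlearrowleft \langle \mathrm{drop}_{\boxminus}(\Pi'), \_ \rangle; t \cdot T_2$.

• Case $\langle \Pi, T_1 \rangle; T \circlearrowleft^{*} \langle \Pi' \cdot \boxminus_{\_}, \_ \rangle; T' \circlearrowleft \langle \Pi', \_ \rangle; T'$:

■ By induction, $\langle \mathrm{drop}_{\boxminus}(\Pi), T_1 \rangle; T \circlearrowleft^{*} \langle \mathrm{drop}_{\boxminus}(\Pi' \cdot \boxminus_{\_}), \_ \rangle; T' = \langle \mathrm{drop}_{\boxminus}(\Pi'), \_ \rangle; T'$.

□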

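Lemma A.7 (Trace actions stick around (prefix version)). If

1. $\langle \Pi_1 \cdot t_2 \cdot \Pi_2, T_1 \rangle, \_, \_, \_, \_ \xrightarrow{t}{}^{*} \langle \Pi_3, \_ \rangle, \_, \_, \_, \_$

2. $\mathrm{drop}_{\boxminus}(\Pi_1) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi_3))$

then $\mathrm{drop}_{\boxminus}(\Pi_1 \cdot t_2) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi_3))$.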

Proof. By induction on the length n of the reduction chain. If n = 0, then 1X3 = Hi •T2 • 112 and thus the claim is obvious. Now 
consider n = 1 + n'. We inspect the first step: 

• Case E.O, U.1-2: 

t 

■ Then (ni-T2-n2, T,), _ ^ {U^, 

■ The claim then follows by induction. 

• Case E.1-5,7, E.P, P.E,l-5,7: 

t 

■ Then (Hi • Ta • Ha • Ti> ^ (03, _) , _, _, _, _, for some t. 
• The claim then follows by induction. 
• Case E.6:

■ Then (Hi • ^2 • n2 •□, Ti ),_,_,_,_ ^ {U-^ . 

■ The claim then follows by induction. 

• Case P.6: 

■ Then (Hi • • n2 • fflr , Ti ),_,_,_,_ (Hs ,_),_,_,_, _. 

■ The claim then follows by induction. 

• Case U.3: 

■ Then (Hi • ^2 • n2 • Bt , Ti ),_,_,_,_ ^ (Hg ,_),_,_,_, _. 

■ The claim then follows by induction. 

• Case E.8: 

■ Subcase (Hi -Ts -Hs, Ti) ; e O* (Hi -Ta -H^, •□, r{) ; Tg: 

- Then (Hi • ■ H'^ • (Tg), T{ ) , _, _, _ ^ (Hg, _) , _, _, _. 

- The claim then follows by induction. 

■Subcase (Hi -Ts -Hs, Ti) ; e O* (ni,_);_0* (Hi •□, r{) ; T3: 

- Then (n'l • (Tg), T{), _ ^ (Hg, . 



/ T -7 -T ■ 



— By Lemma A. 4 we have H'l-O E Prefixes(ni). 

— Hence, using Lemma |A.5| dropg(n']^), dropg(n']^ ■□) e Prefixes(dropg(ni)) C Prefixes(dropg(n3)). 

— Hence dropg(n'2-(T3)) e Prefixes(dropg (Ha)) by induction, contradicting dropg(n'j •□) e Prefixes(dropg(n3)). 
• Case P.8: 

■ Subcase (Hi -Ta -Ha, Ti) ; e O* (Hi -H'j -Ht, e) ; J3 where Ti = e: 

— Then (Hi • r2 • n'2 • (r3 ), T) ^ (n3 ,_),_,_,_, _. 

— The claim then follows by induction. 

■ Subcase (Hi -Ta -Ha, Ti) ; e O* (Hi,.) ;_0* {U[-mT,e) where Ti = e: 

— Then (n'i.(r3),T), (n3 



M -1 -? - 



By Lemma A. 4 we have H'^^ -Ht G Prefixes(ni] 



— Hence, using Lemma |A.5| dropg(n'^), dropg(n2 -Bt) G Prefixes(dropg(ni)) C Prefixes(dropg(n3)). 

— Hence dropg(n'j-(T3)) e Prefixes(dropg(n3))by induction, contradictingdropg(n'^-fflT) G Prefixes(dropg(n3)). 
• Case U.4: 

■ Subcase (Hi -Ta -Ha, Ti) = (Hi -Ta -H^ -Bt, e): 

t 

— Then (Hi • r2 • H'^ , T) ^ {H^ ,_),_, _, _, _. 

— The claim then follows by induction. 

■ Subcase (Hi -Ts -Hs, Ti) = {Il[-BT,e) where r2-n2 = e: 

— Then the claim is (2). 



□ 



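Lemma A.8 (Trace actions stick around (rewinding version)). If

• $\langle \Pi \cdot t, T \rangle, \_, \_, \_, \_ \xrightarrow{t}{}^{*} \langle \Pi', T' \rangle, \_, \_, \_, \_$

• $\langle \mathrm{drop}_{\boxminus}(\Pi'), T_1 \rangle; \varepsilon \circlearrowleft^{*} \langle \mathrm{drop}_{\boxminus}(\Pi), T_2 \rangle; T_3$

then $\langle \mathrm{drop}_{\boxminus}(\Pi'), T_1 \rangle; \varepsilon \circlearrowleft^{*} \langle \mathrm{drop}_{\boxminus}(\Pi \cdot t), T_2 \rangle; T_3' \circlearrowleft \langle \mathrm{drop}_{\boxminus}(\Pi), T_2 \rangle; T_3$.

Proof. Note that the rewinding takes at least one step, otherwise Lemmas A.4 and A.7 would yield $\Pi \cdot t \in \mathrm{Prefixes}(\Pi)$, a
contradiction. We inspect the last step:

• Case $\langle \mathrm{drop}_{\boxminus}(\Pi'), T_1 \rangle; \varepsilon \circlearrowleft^{*} \langle \mathrm{drop}_{\boxminus}(\Pi) \cdot t', T_2 \rangle; T_3' \circlearrowleft \langle \mathrm{drop}_{\boxminus}(\Pi), T_2 \rangle; T_3$ with $T_3 = t' \cdot T_3'$:

■ Lemmas A.4 and A.7 yield $\mathrm{drop}_{\boxminus}(\Pi \cdot t) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$.

■ Lemma A.4 also yields $\mathrm{drop}_{\boxminus}(\Pi \cdot t') \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$.

■ Hence $t = t'$ and we are done.

• Case $\langle \mathrm{drop}_{\boxminus}(\Pi'), T_1 \rangle; \varepsilon \circlearrowleft^{*} \langle \mathrm{drop}_{\boxminus}(\Pi) \cdot \boxminus_{\_}, T_2' \rangle; T_3 \circlearrowleft \langle \mathrm{drop}_{\boxminus}(\Pi), T_2 \rangle; T_3$:

■ Lemma A.4 yields $\mathrm{drop}_{\boxminus}(\Pi) \cdot \boxminus_{\_} \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$, which is a contradiction.

□

Lemma A.9. If $\langle \Pi, T \rangle, \sigma, \kappa, \rho, \alpha_r \xrightarrow{t}{}^{*} \langle \Pi', T' \rangle, \sigma', \kappa', \rho', \alpha_r'$ and $\mathrm{noreuse}(\langle \Pi, T \rangle)$, then $\mathrm{noreuse}(\langle \Pi', T' \rangle)$.

Proof. Easy induction on the length of the reduction. □

Lemma A.10. If $\sigma, \kappa, \rho, \alpha_r \xrightarrow{r}{}^{*} \sigma', \kappa', \rho', \alpha_r'$, then $\sigma, \kappa_0 @ \kappa, \rho, \alpha_r \xrightarrow{r}{}^{*} \sigma', \kappa_0 @ \kappa', \rho', \alpha_r'$ for any $\kappa_0$.

Proof. Easy induction on the length of the reduction. □

Lemma A.11 (CSA preservation (untraced)). If $\sigma, \varepsilon, \rho, e \xrightarrow{r}{}^{*} \sigma', \kappa', \rho', e'$ and $\mathrm{CSA}(\sigma, \rho, e)$, then $\mathrm{CSA}(\sigma', \rho', e')$.

Proof.

• Suppose $\sigma', \varepsilon, \rho', e' \xrightarrow{r}{}^{*} \sigma'', \kappa'', \rho'', e''$.

• We must show $\mathrm{SA}(\sigma'', \rho'', e'')$.

• By Lemma A.10 we get $\sigma', \kappa', \rho', e' \xrightarrow{r}{}^{*} \sigma'', \kappa' \cdot \kappa'', \rho'', e''$.

• Hence $\sigma, \varepsilon, \rho, e \xrightarrow{r}{}^{*} \sigma'', \kappa' \cdot \kappa'', \rho'', e''$.

• The goal then follows from $\mathrm{CSA}(\sigma, \rho, e)$.

□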


Lemma A.12. Suppose dropg(n) e Prefixes(dropg(n')), dropg(n-n) ^ Prefixes(dropg(n')) fl«^^ |k| = #□(!!)■ 

~ ~ t " ^ 

7. //(n-n-n,To),o-,K-[p/,/J-K,p,Q;r — ^ (n',T^),o-',K-K, p',^/, f/jen.- 

~ f rii ~ 

• (n.n.n,To),a,«:.Lp/,/j.7J,p,a^^ (n.n.n',T^'),'^",«-Lp/,/J,£,^ 

• (n-n-n',T^');£0* (n-n,r^");T 

• Pfif) =^fun /(x).e/ 

. (n.n.rF,T^'),^",'«-Lp/,/J,e,sj^ (n.(r),r^"),(T",K,p/[x^],e/ 

. (n-(T),T^">,a",At,p/[5r^^],e/ ^ (H', T^), a', p', a/ 

• n = ni + 1 + n2 

~ ^ t " 

2. //(n-D-n, TQ),a,K - [pf,f\ -K, e, prop — (11', Tl^),a' , k-k, p', a/, f/ien.- 

~ f ni ~ 

• (n-n-n,To),a,K-Lp/,/J-K,£,prop^ (n-n-n',T^'),a",K-Lp/,/J,e,a7 

• (n-D-n', T^');£0* {IV-U,T^');T 

• Pfif) ={un f(x).ef 

. (n-n-n',T^'),a",«-Lp/,/J,e,cj^ (n-(r),r^"),(T",K,p/[x^],e/ 

— + 71-2 

• (n-(T),T^">,a",At,p/[ar^],e/ ^ (n', T^), a', p', a/ 

• n = ni + 1 + n2 

Proof. By mutual induction on 71. If ri = 0, then we obtain a contradiction to dropg(n-n) ^ Prefixes(dropg(n')). So consider 
n > 0. In each part we inspect the first step of the reduction. 

1. • Case E.0-7: Straightforward, using the inductive hypothesis. 

• Case E.8: 

■ Subcase #n(n) = 0: 

- Then n-[pfJ\-K,p,a, = K-[pfJ\,e,uj and (n-D-H, Tq); e O* (n-n,T2);ri. 

- Hence (H-D-n, Tq), cr, k- [p/, /J , e,a; (H- (Ti), T2), ct, k, p/[x ^ w],e/ with 

^ n— 1 

(n-(Ti),r2),(T, K,p/[xT=ra;],e/ — ^ (n',T^),CT',K-/;, p',a/. 

- Thus the claim holds for rii = 0, 712 ~ n — 1. 
• Subcase #n(n) > 0: 

- ThenK-Lp/,/J-K,p,a, = K-[p/,/J-K'-[p,/j,e^aJand (n-D-H, To); e O* (n-D-n'-D, T^'); f . 

- So (n-D-n, To), a, K- [pf, f\ -K, e,Lu^ (n-D-H'- (f ), T^') , a, k- [p/, /J -H', p, e. 
~ + n— 1

- And {U-n-Il'-{T)X'),(^, «• [Pf, /J P,e^ {W, T^), a', k-k, p', a/. 

- Hence the claim holds by induction. 

• Case E.P: By induction (part 2). 

• Case P.E,l-7: Not possible. 

• Case P.8: 

■ Then p,a, = e,uJ and {U-D-U, e); e O* {n-D-n' -mr^, e);Ti. 

■So {n-n-Il,To), a,K-[pf J\-K,e,uJ^ {n-a-U-{Ti),T2),<J,K-[pfJ\-K,e, prop. 

~ 4- 71—1 

■ And (n-D-n- (Ti), T2), a, k- [pf, f\ -k, e, prop {W , T(,),a', k-k, p', a/. 
• Hence the claim holds by induction (part 2). 

• Case U.1-4: Straightforward by induction. 
2. • Case E.0-8^: Not possible. 

• Case P.E: By induction (part 1). 

• Case P. 1-7: Straightforward by induction. 

• Case U.1-4: Not possible. 

• Case P.8: Not possible. 

□ 

Lemma A.13. Suppose dropg(n) G Prefixes(dropg(n')), dropg(n-fflT/) ^ Prefixes(dropg(n')) and |k| = #□(!!). 

1. 7/(n-fflT'-n,To),cr,K-K,/9,ar (H', T|5), (j', K-K, p', a/, then: 

• (n-fflT'-n,To),Cr,K,/9,ar ^ (n-fflT'-n',£),a",K,£,aJ 

• (n-fflT'-n',e);e O* (n-fflrse);? 

• {n-mT'-n' ,£),a" ,K,£,uj {n- {T),T'), a" ,k,£, prop 

• (n-(T),r),a",K,e,prop^ (n',T^),<7',K-K,p',a/ 

• n = ni + 1 + ri2 

2. //(n-fflT' -n, To), (T, K-K, £, prop (n',T^),cr',K-K,p',a/, then: 

• (n-fflT/-n,ro),cr,K-K,e,prop — )> (n-fflT/-n',£),(7",K,£,a7 

• (n-fflT'-n',£);£0* (n-fflT',£);T 

• (n-fflT'-n',£),(T",«;,£,cU-^ (n-(f),T'),cr",K,£,prop 

• (n.(T),r),a",K,£,prop^ (n',T^),c7',K-K,p',a/ 

• n = ni + 1 + 712 

Proof. By mutual induction on n. If ?7 = 0, then we obtain a contradiction to dropg(n-fflx/) ^ Prefixes(dropg(n')). So 
consider n > 0. In each part we inspect the first step of the reduction. 

1. • Case E.0-7: Straightforward, using the inductive hypothesis. 

• Case E.8: 

■ Then K-K, p, a, = k-k'- [p, /j , £,tj and (H-fflT' -H, To); £ O* (n-fflr' -n'-D, T^'); f . 

■ So (n-fflT'-n,ro),o-,K-K,£,aJ-^ {li-mr'-^' ■{f),Tl^),a,K-K' ,pj. 

■ And (n-fflT' -n'- (f ), r^'), a, k-k', p,f^^~^ (n', T^), a', K-K, p', a/. 

■ Hence the claim holds by induction. 

• Case E.P: By part (2). 

• Case P.E,l-7: Not possible. 

• Case P.8: 

■ Subcase #ffl(n) = 0: 

- Then (n-fflr' -H, £); £ 0* (n-fflT',£);Ti and#n(n) = and thus k = e. 

- So (n - fflr' • n, To) , a, K, p, (n- (Ti ) , T2) , fT, K, £, prop. 

4. n— 1 

- And (n- (Ti), T2) , (7, K, £, prop (H', T^), ct', k-k, p', a/. 

- Thus the claim holds for ni = 0, 712 = n — 1. 
■ Subcase #ffl(n) > 0:

- Then (n-fflT-n,e);e O* {U-mT' -U' ■aT„e);Ti with#n(n') = #n(n). 

- So (n-fflT'-n,ro),cr,K-K,p,ar (n-fflT'-n'-(ri),T2),cr,K-K,e,prop. 

- And (n-fflT'-n'-(Ti),T2),cr,K-K,e,prop (n', T^), cr', k-k, p', a/. 

- Hence the claim holds by induction (part 2). 

• Case U.1-4: Straightforward, using the inductive hypothesis. 
2. • Case E.0-8,P: Not possible. 

• Case RE: By pai-t(l). 

• Case P.1-7: Straightforward, using the inductive hypothesis. 

• Case P.8: Not possible. 

• Case U.1-4: Not possible. 

□ 

Lemma A.14. Suppose dropg{Il-D) E Prefixes(dropg(n')) one/ #Q(n) = 

1. //(n-n-n,T),cr,K-K,p,ar (n',r'),cr',K',p',a/, then K e Prefixes(K'). 

2. //(n-n-n,T),CT,K-K,£,prop {U',T'),a',K',p\a/, then k e Prefixes(K')- 

Proof. By mutual induction on n. If n = 0, then we obtain a contradiction to dropg (11 •□) ^ Prefixes(dropg(n')). So consider 
n > 0. In each part we inspect the first step of the reduction. 

1. • Case E.0-7: Straightforward, using the inductive hypothesis. 

• Case E.8: 

■ Subcase #n(n) = 0: 

4- 71—1 

- Lemma A. 7 yields dropg(n-(T'i)) e Prefixes(dropg(n')), which contradicts the first assumption. 

■ Subcase #n(n) > 0: 

- Thenjn-n-n'-(T),r),o-,K-K',p,? (n',r'),cr',K',p',a/ with = |k| - 1 = #n(n) - 1 = 

#n(n') = #n(n'-(f)). 

- Hence the claim holds by induction. 

• Case E.P: By induction (part 2). 

• Case P.E,l-7: Not possible. 

• Case P.8: 

■ Then (n • □ • n' • (f ) , T) , a, K- e, prop ' (H' , T') , a', , a/ with | k| = #□ (H) = #□ (H') = #□ (H' •(?)). 

■ Hence the claim holds by induction (part 2). 

• Case U.1-4: Straightforward by induction. 

2. • Case E.0-8,P: Not possible. 

• Case RE: By induction (part 1). 

• Case P.1-7: Straightforward by induction. 

• Case U.1-4: Not possible. 

• Case P.8: Not possible. 

□ 

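Lemma A.15. Suppose $\mathrm{drop}_{\boxminus}(\Pi) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$.

1. If $\langle \Pi, T \rangle, \sigma, \kappa_1, \rho, \alpha_r \xrightarrow{t}{}^{n} \langle \Pi', T' \rangle, \sigma', \kappa_1 \cdot \kappa, \rho', \alpha_r'$, then $\langle \Pi, T \rangle, \sigma, \kappa_2, \rho, \alpha_r \xrightarrow{t}{}^{n} \langle \Pi', T' \rangle, \sigma', \kappa_2 \cdot \kappa, \rho', \alpha_r'$ for any $\kappa_2$.

2. If $\langle \Pi, T \rangle, \sigma, \kappa_1, \varepsilon, \mathtt{prop} \xrightarrow{t}{}^{n} \langle \Pi', T' \rangle, \sigma', \kappa_1 \cdot \kappa, \rho', \alpha_r'$, then $\langle \Pi, T \rangle, \sigma, \kappa_2, \varepsilon, \mathtt{prop} \xrightarrow{t}{}^{n} \langle \Pi', T' \rangle, \sigma', \kappa_2 \cdot \kappa, \rho', \alpha_r'$ for any $\kappa_2$.

Proof. Mutually, by induction on $n$. If $n = 0$, both parts hold trivially. Now suppose $n > 0$. We inspect the first step of each
reduction.

1. • Case E.0-5,7: By Lemma A.7 (except E.0), induction, and application of the corresponding rule.

• Case E.6:

■ Subcase $\mathrm{drop}_{\boxminus}(\Pi \cdot \square) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$:

- We know $\alpha_r = \mathtt{push}\ f\ \mathtt{do}\ e$ and $\langle \Pi \cdot \square, T \rangle, \sigma, \kappa_1 \cdot \lfloor \rho, f \rfloor, \rho, e \xrightarrow{t}{}^{n-1} \langle \Pi', T' \rangle, \sigma', \kappa_1 \cdot \kappa, \rho', \alpha_r'$.

- By Lemma A.14 we know $\kappa_1 \cdot \lfloor \rho, f \rfloor \in \mathrm{Prefixes}(\kappa_1 \cdot \kappa)$.

- Hence $\kappa_1 \cdot \kappa = \kappa_1' \cdot \kappa'$ for $\kappa_1' = \kappa_1 \cdot \lfloor \rho, f \rfloor$ and some $\kappa'$.

- The claim then follows by induction and application of rule E.6.

■ Subcase $\mathrm{drop}_{\boxminus}(\Pi \cdot \square) \notin \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$: By Lemma A.12, Lemma A.6, Lemma A.4, Lemma A.7, induction
(twice), and rule E.6.

• Case E.8: Lemmas A.4, A.6 and A.7 yield both $\mathrm{drop}_{\boxminus}(\Pi'' \cdot \square) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$ and $\mathrm{drop}_{\boxminus}(\Pi'' \cdot (\_)) \in
\mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$, which is a contradiction.

• Case E.P: By Lemma A.7, induction (part 2), and application of E.P.

• Case P.8: Lemmas A.4, A.6 and A.7 yield both $\mathrm{drop}_{\boxminus}(\Pi'' \cdot \boxplus_{\_}) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$ and $\mathrm{drop}_{\boxminus}(\Pi'' \cdot (\_)) \in
\mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$, which is a contradiction.

• Case U.1-4: By induction and application of the corresponding rule.

• Case P.E,1-7: Not possible.

2. • Case E.0-8,P: Not possible.

• Case P.1-5: By Lemma A.7, induction, and application of the corresponding rule.

• Case P.6:

■ Subcase $\mathrm{drop}_{\boxminus}(\Pi \cdot \boxplus_{T}) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$: By induction and application of rule P.6.

■ Subcase $\mathrm{drop}_{\boxminus}(\Pi \cdot \boxplus_{T}) \notin \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$: By Lemma A.13, Lemma A.6, Lemma A.4, Lemma A.7, induction
(twice), and rule P.6.

• Case P.7,E: By Lemma A.7, induction (part 1), and application of the corresponding rule.

• Case P.8: Not possible.

• Case U.1-4: Not possible.

□

Lemma A.16 (CSA preservation (traced)). If

1. $\langle \Pi, \varepsilon \rangle, \sigma, \kappa, \rho, e \xrightarrow{t}{}^{*} \langle \Pi', \_ \rangle, \sigma', \kappa \cdot \kappa', \rho', e'$

2. $\mathrm{drop}_{\boxminus}(\Pi) \in \mathrm{Prefixes}(\mathrm{drop}_{\boxminus}(\Pi'))$

3. $\mathrm{noreuse}(\langle \Pi, \varepsilon \rangle)$

4. $\mathrm{CSA}(\sigma, \rho, e)$

then $\mathrm{CSA}(\sigma', \rho', e')$.

Proof.

• By Lemma A.15 we get $\langle \Pi, \varepsilon \rangle, \sigma, \varepsilon, \rho, e \xrightarrow{t}{}^{*} \langle \Pi', \_ \rangle, \sigma', \kappa', \rho', e'$.

• By Lemma A.9, that reduction does not use rules other than E.* and U.*.

• Hence by Lemma A.1 we get $\sigma, \varepsilon, \rho, e \xrightarrow{r}{}^{*} \sigma', \kappa', \rho', e'$.

• The claim then follows by Lemma A.11.

□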



Lemma A.17 (Decomposition). Suppose T fsc, from initial configuration (11, s) , cr, n, p, ar and producing v. 
l.IfT ^ Ai^.ri-T', then: 

t * 

(a) (n, e). cr, K, p, ttr — !• {H, e) , a, K, p' , let X = alloc (y) in e using ¥..0 only 

(b) T' fscfrom (If-yA^ „j, e), cr', k, p'[x i— > £], e, producing V 

(c) CT, p', alloc Cy-) a',£ 

(d) p'{y) = m 

2. IfT ^ F^il„i]-T', then: 



(a) (n, e), cr, K, p, Qfr 



(n, e) , cr, K, p', let X = read (y[z]) in e using E.O only 
(b) T' fscfrom (H- R^[,„], e), cr, k, p'[x i— ^ i'], e, producing v

(c) a, p' , read (y[z]) <t,i^ 

(d) p'{y) = £ 

(e) p' {z) = m 
3.IfT=Wi^^^yr,then: 

(a) (n, s), a, K, p, a, — *-> (H, e) , a, k, p', let _ = write (x[y],z) in e using E.O only 

(b) T' fscfrom {11 -W^^^-i^^e), a' ,K,p' ,e, producing V 

(c) a, p' , write (x[y],z) a',0 

(d) p'{x) = £ 

(e) p'{y) = m 

(f) P'{z) = V 
4.IfT= Mp,^^-T', then: 

t * 

(a) {H, e) , a, K, p,ar — > {H, e) , a, k, p' , memo e using F,.0 only 

(b) T' fscfrom (H- t^p',e, e), cr, k, p', e, producing V 

5. IfT = Up>,e-T', then: 

(a) (n, e), cr, K, p, Q!r — *-> (n, e) , cr, K, p', update e Mi;«g E.O o«fy 

(b) T' fscfrom (H- Up' £), cr, k, p', e, producing V 

6. IfT ^ (Ti)-T2, f/zen; 

fflj (n, e), cr, K, p, ttr — *-> (n, e) , cr, K, p', push / do e Mi/ng E.O on/y 

(b) Ti fscfrom (H-D, e), cr, k- [p', /J , p', e, producing u 

(c) T2 fsc from (H • (Ti ) , e) , _, k, p' [an=ra;] , e', producing v 

(d) p'if) = fun /(x).e' 

7. IfT ^ uj-T', then: 

t * _ 

(a) (n, e), cr, K, p, Q!r — !► {H, e) , a, K, p' , pop X using only Fj.O 

(b) p'{x) = U 

(c) r = e 

(d) uj = ly 

Proof. From the assumption we know that: 

(i) CSA(cr,p, Qfr) 

(ii) noreuse((n, e)) 

t " _ 

(iii) (n,£),cr, K, p, ttr — > {W ,e),a' ,K,e,V 

(iv) (n',e);eO* (n,£);r 

The proof is by induction on n. We are only interested in cases where T is nonempty and thus n > 0. In each part we inspect 
the first step of the reduction in (iii). 

l.T = Ae.rn-T' 



• Case E.O: By Lemma A. 16 and induction. 

• Case E.l: 

■ Then: 

(a) Q!r = let X = alloc (y) in e 

^ n— 1 

(b) (n-A^/^,„',e),(T",K, p[a; 1-^ £'],e — > (11', e), cr', k, e, F 

(c) cr,p, alloc (y) ^ a",£' 

(d) p(y) = m' 

• By (iv), Lemma [A!6| and Lemma |A.8| we get (dropg(n'), e); e O* (dropg(n-A£/^m/), e); T' O (dropg(n), e) ; T 
with T = Af',™' -T', hence £' ^ £ and m' = m. 

■ By Lemma A. 9 we know dropg(n') = 11' and dropg(n- A^ m) — H-A^ 

■Finally, Lemma A. 16 yields CSA(cr",p[a; £],e) and therefore T' fsc from (11 • A^' e), cr", k, p[a; i—> e, 
producing V. 
• Case E.2-8:



' Then {n-t,e) ,(t" , k, p" , a/' 



A.12 



in case E.6). 



- iiii^ii c/, u , n,, p ju;,. r (H' , s) , ct' , K 17 with < 7^ Afm (USing Lemma rT.. j. ^ ill >^aa>^ 

■ By (iv), Lemma |A.6| and Lemma |A.8| we get (dropg(n'), e); e O* (dropg(n-i), e); T' O (dropg(n), e) ; T with 
T = t-T'. 

• This is a contradiction. 

• Case E.P,P.E,P.l-8,U.l^: Impossible due to (ii). 

2. r = R,^[„.].r 

• Case E.O: By Lemma A. 16 and induction. 

• Case E.2: base case 

• Case E.1,3-8: contradiction 

• Case E.P,P.E,P.l-8,U.l^: Impossible due to (ii). 

3. r = W,%,,].T' 

• Case E.O: By Lemma A. 16 and induction. 

• Case E.3: base case 

• Case E.1,2,4-8: contradiction 

• Case E.P,P.E,P.l-8,U.l-4: Impossible due to (ii). 

4. T = Mp-,e-r' 

• Case E.O: By Lemma A. 16 and induction. 

• Case E.4: base case 

• Case E.1-3,5-8: contradiction 

• Case E.P,P.E,P.l-8,U.l^: Impossible due to (ii). 

5. r = Up.,e-T' 

• Case E.O: By Lemma A. 16 and induction. 

• Case E.5: base case 

• Case E.1-4,6-8: contradiction 

• Case E.P,P.E,P.l-8,U.l-4: Impossible due to (ii). 

6. r = (Ti)-r2 

• Case E.O: By Lemma A. 16 and induction. 

• Case E.6: 

■ Then = push / do e and {n-0,e) , a, K-[p, f \, p,e (II', e), cr', k, e,v. 

• By (iv), Lemma|A.6| Lemma|A.12|and Lemma|A.9|we get: 

I "i _ 

- {n-n,e},a,K-[pJ\,p,e — > (H", e), cr", k- [p, /J , e, w 

- (n",£);eo* (n.n,e);r 

- p{f) = fun f{x).ef 

- {n",e),a",K-[pJ\,e,oJ^ {U-{f),e),a",^,p[x^],ef 

— (n- (T), e), cr", K, p[x i-> w], ey — > (II', e), cr', k, e, 17 

— 71 — 1 = Til + 1 + 7i2 

■ By (iv), Lemma[A!6|and Lemma[A!8]we get (dropg(n'), e); e O* (dropg(n- (f )), e); T' O (dropg(n),e) ;T with 
T = {T)-T' and thus Ti = T and T2 = T . 

• Hence by Lemma A. 9 and Lemma [A. 16 we know: 

— Ti fsc from (H-D, e),a, n- [p, /J , p, e 

— T2 fsc from (n-(T), e), cr", k, p[xT=ra7], e/ 

• Case E.1-5,7,8: contradiction 

• Case E.P,P.E,P.l-8,U.l^: Impossible due to (ii). 

7. T = uj-T' 



• Case E.O: By Lemma A. 16 and induction. 

• Case E.7: 

■ Then = pop x and (II, e) , a, k, p, pop x — 

■ We show that ?i — 1 = 0: 

— For a contradiction, suppose that n — 1 > 0. 



(n-w', e), cr, K, e, uj' 



n-l 



(n', e), cr', K, £, V, where uj' = p{x). 
— Note that then the next reduction step must be either P.8 or E.8.

— In either case, using Lemmas A.4 A. 5 and A. 7 we would get a contradiction to (H', e); e O* (H, e); T. 

• Hence uJ' — V. 

• Furthermore, Lemmas A. 6 and A. 8 yield T = cj' and thus oJ = (7 and T' ~ e. 

• Case E.1-6,8: contradiction 

• Case E.P,P.E,P.l-8,U.l^: Impossible due to (ii). 



Definition A.8 (Last element of a trace).

$\mathrm{last}(t \cdot T) = \mathrm{last}(T)$ if $T \neq \varepsilon$

$\mathrm{last}(t \cdot \varepsilon) = t$

$\mathrm{last}(\varepsilon)$ undefined
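
Read as a recursive function, Definition A.8 can be transcribed directly. In the Python sketch below a trace is modeled as a tuple of trace actions (an encoding chosen for this illustration), and the undefined case is signaled with an exception.

```python
def last(trace):
    """Last element of a non-empty trace, following Definition A.8 clause by clause."""
    if not trace:
        raise ValueError("last is undefined on the empty trace")
    t, rest = trace[0], trace[1:]
    if not rest:            # last(t . eps) = t
        return t
    return last(rest)       # last(t . T) = last(T) when T is non-empty

example = last(("alloc", "read", "value"))   # "value"
```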

Lemma A.18 (Evaluation values). If $T$ fsc producing values $\bar{\nu}$ then $\mathrm{last}(T) = \bar{\nu}$.

Proof. By induction over the structure of $T$.

• Case $T = \varepsilon$:
■ Not possible.

• Case $T = \bar{\omega} \cdot T'$: By Lemma A.17.

• Case $T = t \cdot T'$ with $t$ not a value: By Lemma A.17 and induction.

□

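Lemma A.19 (Propagation values). If

(a) $\langle \Pi, T_1 \rangle, \sigma, \kappa, \varepsilon, \mathtt{prop} \xrightarrow{t}{}^{n} \langle \Pi', \_ \rangle, \sigma', \kappa, \varepsilon, \bar{\nu}$

(b) $\langle \mathrm{drop}_{\boxminus}(\Pi'), \_ \rangle; \varepsilon \circlearrowleft^{*} \langle \mathrm{drop}_{\boxminus}(\Pi), \_ \rangle; \_$

(c) reduction (a) does not contain a use of P.E

Then $\mathrm{last}(T_1) = \bar{\nu}$.

Proof. By induction on the number of reduction steps $n$. Note necessarily that $n > 0$. We inspect the first reduction step of (a).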



• Case E.0-E.8,U.1-U.4,P.8,E not possible, due to (c). 

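• Case P.1-P.5

■ Then $\langle\Pi, t\cdot T_1'\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi\cdot t, T_1'\rangle, \sigma', \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n-1} \langle\Pi', \_\rangle, \sigma', \kappa, \varepsilon, \bar\nu$ with $T_1 = t\cdot T_1'$.

■ By Lemma A.8 we have that $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi\cdot t), \_\rangle; \_ \circlearrowleft \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$.

■ The claim follows by induction.

• Case P.6

■ Then $\langle\Pi, (T_2)\cdot T_3\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi\cdot\boxplus_{T_3}, T_2\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n-1} \langle\Pi', \_\rangle, \sigma', \kappa, \varepsilon, \bar\nu$ with $T_1 = (T_2)\cdot T_3$.

■ From Lemma A.13 we have $\langle\Pi\cdot\boxplus_{T_3}, T_2\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_1} \langle\Pi\cdot(T_2'), T_3\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_2} \langle\Pi', \_\rangle, \sigma', \kappa, \varepsilon, \bar\nu$.

■ From Lemma A.8 we have that $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi\cdot(T_2')), \_\rangle; \_ \circlearrowleft \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$.

■ The claim follows by induction.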

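• Case P.7:

■ Then $\langle\Pi, \bar\omega\cdot\varepsilon\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi\cdot\bar\omega, \varepsilon\rangle, \sigma, \kappa, \varepsilon, \bar\omega \xrightarrow{\mathsf{t}}{}^{n-1} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$ with $T_1 = \bar\omega\cdot\varepsilon$.

■ We show that $n - 1 = 0$:

- For a contradiction, suppose that $n - 1 > 0$.

- We inspect the next step in $n - 1$, which must be either P.8 or E.8. We assume E.8; P.8 is analogous.

- Hence $\langle\Pi\cdot\bar\omega, \varepsilon\rangle, \sigma, \kappa, \varepsilon, \bar\omega \xrightarrow{\mathsf{t}} \langle\Pi''\cdot(T), T'\rangle, \sigma, \kappa', \rho_f, e_f \xrightarrow{\mathsf{t}}{}^{n-2} \langle\Pi', \_\rangle, \sigma', \kappa, \varepsilon, \bar\nu$ where $\langle\mathrm{drop}_\square(\Pi\cdot\bar\omega), \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi''\cdot\square), T'\rangle; T$.

- By Lemma A.4 we know $\Pi''\cdot\square \in \mathrm{Prefixes}(\Pi\cdot\bar\omega)$, i.e., $\Pi''\cdot\square \in \mathrm{Prefixes}(\Pi)$.

- Hence $\mathrm{drop}_\square(\Pi''), \mathrm{drop}_\square(\Pi''\cdot\square) \in \mathrm{Prefixes}(\mathrm{drop}_\square(\Pi)) \subseteq \mathrm{Prefixes}(\mathrm{drop}_\square(\Pi'))$ using Lemma A.4 (d), and Lemma A.5.

- Using Lemma A.7 we get $\mathrm{drop}_\square(\Pi''\cdot(T)) \in \mathrm{Prefixes}(\mathrm{drop}_\square(\Pi'))$, contradicting $\mathrm{drop}_\square(\Pi''\cdot\square) \in \mathrm{Prefixes}(\mathrm{drop}_\square(\Pi'))$.

■ Hence, since $n = 1$ we have that

- $\Pi\cdot\bar\omega = \Pi'$

- $\bar\omega = \bar\nu$

■ Moreover, $\mathrm{last}(T_1) = \mathrm{last}(\bar\nu\cdot\varepsilon) = \bar\nu$

□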

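Lemma A.20 (Case analysis: waking up before push action). If

(a) $\langle\Pi, T\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

(b) $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$

(c) $\Pi$ ok

(d) $T$ fsc

(e) $T = T_1\cdot(T_2)\cdot T_3$

(f) $T_1$ contains no parenthesis (i.e., $(\_) \notin T_1$)

Then either:

1. • $\langle\Pi, T\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_1} \langle\Pi'', U_{\rho,e}\cdot T_1'\cdot(T_2)\cdot T_3\rangle, \sigma'', \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi''\cdot U_{\rho,e}, T_1'\cdot(T_2)\cdot T_3\rangle, \sigma'', \kappa, \rho, e \xrightarrow{\mathsf{t}}{}^{n_2} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

• $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi''\cdot U_{\rho,e}), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$

• $\Pi''$ ok

• $U_{\rho,e}\cdot T_1'\cdot(T_2)\cdot T_3$ fsc

• $\mathrm{last}(T_1'\cdot(T_2)\cdot T_3) = \mathrm{last}(T)$

• $n = n_1 + 1 + n_2$

2. • $\langle\Pi, T\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_1} \langle\Pi'', T_3\rangle, \sigma'', \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_2} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

• $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi''), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$

• $\Pi''$ ok

• $T_3$ fsc

• $\mathrm{last}(T_3) = \mathrm{last}(T)$

• $n = n_1 + n_2$, $n_1 > 0$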

Proof. By induction on the number of reduction steps n. Note necessarily that n > 0. We inspect the first step taken. 

• Cases E.0-E.8, E.P, U.1-U.4: not possible. 

• Case P.1-P.5: 

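■ Then $T = t\cdot T'$ and $\langle\Pi, t\cdot T'\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi\cdot t, T'\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n-1} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

■ Hence, $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi\cdot t), \_\rangle; \_ \circlearrowleft \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$ by Lemma A.8

■ From $\Pi$ ok, we have $\Pi\cdot t$ ok

■ Note that $T' = T_1'\cdot(T_2)\cdot T_3$ where $T_1 = t\cdot T_1'$

■ Hence, from (f) we have that $T_1'$ contains no parenthesis

■ From $T$ fsc we have $T'$ fsc using Lemma A.17

■ The claim then follows by induction.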
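• Case P.E: We show claim (1) as follows:

■ Then $T = U_{\rho,e}\cdot T_1'\cdot(T_2)\cdot T_3$ and $\langle\Pi, U_{\rho,e}\cdot T'\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi\cdot U_{\rho,e}, T'\rangle, \sigma, \kappa, \rho, e \xrightarrow{\mathsf{t}}{}^{n-1} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$, where $T' = T_1'\cdot(T_2)\cdot T_3$

■ Hence, $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi\cdot U_{\rho,e}), \_\rangle; \_ \circlearrowleft \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$ by Lemma A.8

■ With $n_1 = 0$, claim (1) follows immediately by assumptions (c), (d) and (e).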
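• Case P.6: We show claim (2) as follows:

■ Then $T = (T_2)\cdot T_3$, $T_1 = \varepsilon$, and $\langle\Pi, (T_2)\cdot T_3\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi\cdot\boxplus_{T_3}, T_2\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n-1} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

■ From Lemma A.13 we have:

(i) $\langle\Pi, (T_2)\cdot T_3\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi\cdot\boxplus_{T_3}, T_2\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{m_1} \langle\Pi\cdot(T_2'), T_3\rangle, \sigma'', \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{m_2} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

(ii) $n = 1 + m_1 + m_2$

■ Hence $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi\cdot(T_2')), \_\rangle; \_ \circlearrowleft \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$ using (b) and Lemma A.8

■ From $\Pi$ ok we have $\Pi\cdot(T_2')$ ok

■ From Lemma A.17 and (d), we have $T_3$ fsc

■ Finally, by definition $\mathrm{last}(T_3) = \mathrm{last}((T_2)\cdot T_3) = \mathrm{last}(T)$.

■ This completes the case, showing claim (2) with $n_1 = 1 + m_1$ and $n_2 = m_2$.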

• Case P.7: not possible; it contradicts assumption (e). 

• Case P.8: not possible; it contradicts assumption (a). 

□ 

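Lemma A.21 (Case analysis: final (non-nested) awakening). If

(a) $\langle\Pi, T\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

(b) $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$

(c) $\Pi$ ok

(d) $T$ fsc

Then either:

1. • $\langle\Pi, T\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_1} \langle\Pi'', U_{\rho,e}\cdot T'\rangle, \sigma'', \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi''\cdot U_{\rho,e}, T'\rangle, \sigma'', \kappa, \rho, e \xrightarrow{\mathsf{t}}{}^{n_2} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

• $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi''\cdot U_{\rho,e}), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$

• $\Pi''$ ok

• $U_{\rho,e}\cdot T'$ fsc

• $\mathrm{last}(T) = \mathrm{last}(T')$

• $n = n_1 + 1 + n_2$

2. • $\langle\Pi, T\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_1} \langle\Pi'', T'\rangle, \sigma'', \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_2} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

• $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi''), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$

• $\Pi''$ ok

• $T'$ fsc

• $\mathrm{last}(T) = \mathrm{last}(T')$

• $n = n_1 + n_2$

• Reduction $n_2$ contains no use of P.E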
Proof. Case analysis on the shape of trace T: 

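• Case: $\exists T_1, T_2, T_3$ such that $T = T_1\cdot(T_2)\cdot T_3$ and $T_1$ contains no parenthesis.

■ Applying Lemma A.20 we get subcases (i) and (ii):

(i) - $\langle\Pi, T\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_1} \langle\Pi'', U_{\rho,e}\cdot T_1'\cdot(T_2)\cdot T_3\rangle, \sigma'', \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}} \langle\Pi''\cdot U_{\rho,e}, T_1'\cdot(T_2)\cdot T_3\rangle, \sigma'', \kappa, \rho, e \xrightarrow{\mathsf{t}}{}^{n_2} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

- $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi''\cdot U_{\rho,e}), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$

- $\Pi''$ ok

- $U_{\rho,e}\cdot T_1'\cdot(T_2)\cdot T_3$ fsc

- $\mathrm{last}(T_1'\cdot(T_2)\cdot T_3) = \mathrm{last}(T)$

- $n = n_1 + 1 + n_2$

■ This immediately shows claim (1).

(ii) - $\langle\Pi, T\rangle, \sigma, \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_1} \langle\Pi'', T_3\rangle, \sigma'', \kappa, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n_2} \langle\Pi', \varepsilon\rangle, \sigma', \kappa, \varepsilon, \bar\nu$

- $\langle\mathrm{drop}_\square(\Pi'), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi''), \_\rangle; \_ \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi), \_\rangle; \_$

- $\Pi''$ ok

- $T_3$ fsc

- $\mathrm{last}(T_3) = \mathrm{last}(T)$

- $n = n_1 + n_2$, $n_1 > 0$

■ Since we have that $n_2 < n$, we continue by induction on reduction $n_2$, which shows the claim.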

• Case: Otherwise: Note necessarily that $T = t_1\cdot\ldots\cdot t_m$ such that $\forall i.\ t_i \neq (\_)$

■ Subcase: reduction (a) contains a use of P.E:

- Hence, $\exists t_i = U_{\rho,e}$ such that $T = t_1\cdot\ldots\cdot t_i\cdot\ldots\cdot t_m$ and $\Pi'' = \Pi\cdot t_1\cdot\ldots\cdot t_{i-1}$

- Then, since $\Pi$ ok we have $\Pi''$ ok

- Moreover, $\mathrm{last}(T) = \mathrm{last}(t_1\cdot\ldots\cdot t_i\cdot\ldots\cdot t_m) = \mathrm{last}(t_2\cdot\ldots\cdot t_i\cdot\ldots\cdot t_m) = \mathrm{last}(t_i\cdot\ldots\cdot t_m) = \mathrm{last}(t_{i+1}\cdot\ldots\cdot t_m)$

- Since $T' = t_{i+1}\cdot\ldots\cdot t_m$, we have that $\mathrm{last}(T) = \mathrm{last}(T')$

- We get the rest of claim (1) from repeated use of Lemmas A.8 and A.17 (the number of required uses is $i - 1$).

■ Subcase: reduction (a) contains no use of P.E:

- Then we have claim (2) immediately, with $n_1 = 0$.

□ 
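The chain of equalities in the "Otherwise" case rests on a simple fact: dropping a prefix of a trace does not change its last element, as long as the remaining suffix is non-empty. A small sketch in Python under the assumption that traces are lists of actions; `split_at_first_update` and the string encoding of actions are hypothetical helpers for illustration only.

```python
def last(trace):
    """Final action of a non-empty trace (list model of Definition A.8)."""
    if not trace:
        raise ValueError("last is undefined on the empty trace")
    return trace[-1]


def split_at_first_update(trace, is_update):
    """Split t1..tm at the first update action ti.

    Returns (prefix, ti, suffix), or None when no update action occurs.
    """
    for i, action in enumerate(trace):
        if is_update(action):
            return trace[:i], action, trace[i + 1:]
    return None


# A hypothetical trace with one update action followed by more actions.
trace = ["A", "R", "U", "W", "v"]
prefix, update, suffix = split_at_first_update(trace, lambda a: a == "U")
```

Here `prefix` plays the role of the actions appended to $\Pi$ to form $\Pi''$, and `suffix` plays the role of $T'$; the sketch assumes the update action is not the final action of the trace, so that `last(suffix)` is defined.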



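Theorem A.22 (Consistency).

1. If

(a) $\langle\Pi_2, T_1'\rangle, \sigma_2, \kappa_2, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{n} \langle\Pi_2', T_1''\rangle, \sigma_2', \kappa_2, \varepsilon, \bar\nu_2$

(b) $\Pi_2$ ok

(c) $T_1'$ fsc, from initial configuration $\langle\Pi_1, \varepsilon\rangle, \sigma_1, \kappa_1, \rho_1, \alpha_{r1}$

(d) $\langle\mathrm{drop}_\square(\Pi_2'), \_\rangle; \varepsilon \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi_2), \_\rangle; T_2'$

then for any $\Pi_3$ there is $\Pi_3'$ such that

(i) $\langle\Pi_2', T_1''\rangle$ ok

(ii) $\langle\Pi_3, \varepsilon\rangle, \sigma_2|_{gc}, \kappa_3, \rho_1, \alpha_{r1} \xrightarrow{\mathsf{t}}{}^{*} \langle\Pi_3', \varepsilon\rangle, \sigma_2'|_{gc}, \kappa_3, \varepsilon, \bar\nu_2$

(iii) $\langle\Pi_3', \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\Pi_3, \varepsilon\rangle; T_2'$

2. If

(a) $\langle\Pi_2, T_1'\rangle, \sigma_2, \kappa_2, \rho_2, \alpha_{r2} \xrightarrow{\mathsf{t}}{}^{n} \langle\Pi_2', T_1''\rangle, \sigma_2', \kappa_2, \varepsilon, \bar\nu_2$

(b) $\langle\Pi_2, T_1'\rangle$ ok

(c) $\langle\mathrm{drop}_\square(\Pi_2'), \_\rangle; \varepsilon \circlearrowleft^{*} \langle\mathrm{drop}_\square(\Pi_2), \_\rangle; T_2'$

then for any $\Pi_3$ there is $\Pi_3'$ such that

(i) $\langle\Pi_2', T_1''\rangle$ ok

(ii) $\langle\Pi_3, \varepsilon\rangle, \sigma_2|_{gc}, \kappa_3, \rho_2, \alpha_{r2} \xrightarrow{\mathsf{t}}{}^{*} \langle\Pi_3', \varepsilon\rangle, \sigma_2'|_{gc}, \kappa_3, \varepsilon, \bar\nu_2$

(iii) $\langle\Pi_3', \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\Pi_3, \varepsilon\rangle; T_2'$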



Proof. By simultaneous induction on n. 

1. Note that necessarily n > 0. We inspect the first reduction step of (a). 

• Case P.l: 

---- ^ ---- ^ n— 1 

■ThenT{ = A,,„,Tiand(n2,Ti'),a2,'«2,e,prop^ (n2-A,,,„,Ti),^,K2,e,prop^ (n^,Ti") , cr2,K2,£,I^2, 

where cr2 , e, alloc (to) — ^ , 
■Hence (dropg(n'2),_);£ O* (dropg(n2-A,,„), _);T^ O (dropg(n2), _) ; A,,,„ -T^ with - A,,„ • 5^ by 

Lemma [ATS] and (d). 

■ By Lemma A. 17 and (c) we get: 

- (Hi, e), (Ti, Ki, pi, Qfri (Hi, e), (Ti, Ki, p'l, let x = alloc (y) in e using E.O only 

- Ti fsc from (Hi • A^^„, e), crj , ki, p'^ [a; ^],e 

- CTi, pi, alloc (y) a'l^i 

- p'liy) = m 

• From (b) we get 112 ■ ^i,m ok. 

■ By induction then: 

(i) (n^,Ti") ok 

(ii) {n3-Ai^jn,£),o^\gc,K3,p[[xi-^i],e-^ (n^, e), CTjIgc, K2, e, 

(iii) {U'^,e);e(3* (n3-A,,™, e); O (03, e) ; 

■ From (T2,e, alloc (m) ctJ, ^ and the knowledge about p'^ follows 0-2 1 gc, p'l, alloc (y) SJIgc,^- 

■ Hence, using Lemma A.3 (n3, e), cr2|gc, ^3, Pi, {n^3-^e.ni,£),'S^\gc, K3, p'i[x ^ i],e (Hg, e) , cTjIgc ^2, e, 

• Case P.2: 
■ Then T[ = R'i^^fi and (Ha, T^) , ^2, '«2, e, prop ^ (Ha • R,%] , Ti) , ^2, «:2, e, prop ^ (H^, Tf), a^, ^2, £, I^, 

where (72, e, read [to]) ^^•o'2,'^- 

■Hence (dropg(n^), _); e O* (dropH(n2- R.^]), -); 5^ O (dropB(n2), _) ; R.^,, -5^ with = R,^[,„] • by 
Lemma [ATS] and (d). 

■ By Lemma A. 17 and (c) we get; 

- (Hi, e), (Ti, Ki, pi, ttri (Hi, e), (Ti, Ki, p'^jlet a; = read(j/[z]) in e using E.O only 

- Ti fsc from (Hi • R£[,„] , e) , (Ti , ki , p'^ [a; H> i^] , e 

- cri,pi,read(2/[z]) ^o-i,z/ 

- p[{z) = TO, 

■ From (b) we get 1X2 • Re[„i] ok. 

■ By induction then: 

(i) (n^,T{') ok 

t * 

(ii) (n3-R^[„],e),cr2|gc,K3,Pi[a; H^e — > (HJ,, e), CTjIgc, ^2, e, 

(iii) (n;,, e)-eO* (Ha • R.^j , e) ; O (Hg, e) ; 

■ From (72, £, read(^[m]) — > 72, and the knowledge about p[ follows a2|gc, p'l, read(i/[z]) — !• cr2|gc, i^- 

■ Hence, by Lemma A.3 (Hg, e), cr2|gc, K3, pi, a^i (Hg • R^[„] , e), cr2|gc, '«3, /oi[a; i-^- e (H^, e), CTalgc ^2, £, 
• Case P.3:

— ~ f - — . f n— 1 

■ThenTi' =W,"[„]-Ti and (n2,r{), (72,^2, e, prop (n2-W^[„,],Ti), ^,At2,e, prop (n!,, TO, cr^, ^2, e, I^, 

where (72, e, write(^[TO],i/) ctJ, 0. 
■Hence (dropg(n;,), _); e O* (dropg(n2-W,V]), O (dropg(n2), _) ; W,^[„] -T^ with = W.^j^j-T^by 

Lemma [ATS] and (d). 

■ By Lemma A. 17 and (c) we get: 

t * 

- (Hi, e), (Ti, Ki, pi, an — !• (Hi, e), (Ti, ki, p'^, let _ = write in e using E.O only 

- Ti fsc from (Hi • W^[„] ,e),a[,Ki,p[,e 

- CTi,p'i, write -^0-^,0 
-p[{x)^£ 

- p'liy) = m 

- P'liz) = 

■ From (b) we get n2-W£[,„] ok. 

■ By induction then: 

(i) (n^,T{') ok 

(ii) (n3-W^[„],e),cr2|gc,K3,pi,e (njj, e), cTslgc, ^2, e, I^. 

(iii) (n^,e);eO* (n3-W,^[„,j, e); O (n3, e) ; 
From (72, e, write (z^[^],to) — > a'2,0 and the knowledge about p'^ follows cr2|gc, pi, write — !• (J2\gc, 0. 

■ Hence, using Lemma A.3 (n3,e),Cr2|gc,K3,Pl,arl (n3-W^[,„],e),72|gc,K3,P'l,e (n3,£),CT2lgc,'«2,e,i^

• Case P.4:



Then r{ = Mp,e-Ti and (02, r{), (72, ^2, e, prop (02 • Mp,e, Ti), (72, K2,e, prop ^ (H^, T{'), (t^, ^2, £, J^- 



■ Hence (dropg(ny, _); e O* (drapg(n2-Mp_e), _);T2 O (dropg(n2), _) ; Mp,e-72 withT^ = Mp^e'^^by Lemma|A.8 
and(d). 

■ By Lemma A. 17 and (c) we get: 

— (Hi, e), CTi, Ki, pi, ari (Hi, e), (71, Ki, p, memo e using E.O only 

— Ti fsc from (Hi • Mp,e, e), cri, ki, p, e 

■ From (b) we get 112 • Mp ^ ok. 

■ By induction then: 

(i) (n^,T{') ok 

(ii) (n3-Mp,e,£),(72|gc,K3,P,e > (nJj,£),(7^|gc,K2,e,Z^2 
(iii) {U'3,e);eC>* (ns-M,,,, e); (n3,e);rj 



■ Finally, using Lemma A.3 (n3,£>,Cr2|gc,K3,Pl,arl (Ha • Mp^g , e) , (Ta Igc, Kg, p, 6 (nJj,£),Cr^|gc,K2,e, ^^2

• Case P.5: 

^ ( ri— 1 

■ ThenT{ = Up,e-Ti and (Hs, r{), (72, K2, £, prop — > (Ha • Up^e, Ti), cr2, ^2, e, prop — > (H^, T{'), cr^,, K2, £, i^. 

■Hence (dropH(n'2), _); £ O* (dropg(n2 • Up,e), _); 5^ O (dropH(n2), _) ; Up,e-T^ with = Up,e-J^by LemmajXi 
and (d). 

■ By Lemma A. 17 and (c) we get: 

- (Hi, e), cTi, Ki, pi, ttri — (Hi, e), (Ti, Ki, p, update e using E.O only 

- Ti fsc from (Hi • Up,e , e) , , ki , p, e 

■ From (b) we get 112 • ^p,e ok. 

■ By induction then: 

(i) (n^,T{') ok 

(ii) (n3-Up,e,£),Cr2|gc,K3,p,e > (H^,, e) , CT^ |gc, ^2, £, J^2 

(iii) {Il'^,e);eO* (n3- Up,e,£);T^ O (n3,£) ;Tj 

■ Finally, using Lemma|A3j (113, e), 0-2 1 gc,K3, Pi, "n — > (Ha- Up,e, e), (T2|gc, K3, p, e — > (IIJ,, e), cr^|gc, K2, £, J^2 

• Case P.6: 

■ 4. ■ f ri— 1 

■ Then T[ = (Ti ) • Ti and (02 , r{ ) , ^2 , ^2 , £, prop — > (02 • fflj^ , Ti ) , a2 , K2 , £, prop ^ (H^ , Tf) , ct^ , ac2 , £, i^. 
• By Lemmas | A.4| and p\. 1 3 1 we get: 

- (n2-ffl5^,ri),cr2,K2,£,prop (n2,e),o^,K2,£,'^ 

- (n^,e);eO* (n2-%,£);r ^ 

- (n2,£),CTj,K2,£,I7 (n2-(T),ri),CTj,K2,£,prOp 

- (n2-(T),ri),(72,AC2,£,prop-U (n^,T{'),(T^,K2,£,I^ 

- n— 1= 711+1+712 

■ By Lemma A. 17 and (c) we get: 

t * 

- (ni,e),cri,Ki,pi,ari — > (Hi, e), cri,Ki,p, push /doe using E.O only 

- Ti fsc from (Hi • □, e) , cri , ki • [p, /J , p, e, producing value w 

- Ti fsc from (Hi • (7\), e), cri, ki, p', e' 

- p(/) = fun fix).e' 

- p' = p[an=ra7] 

■ Since 112 ok and Ti fsc, we know 1X2 • ffl ok. 



Furthermore, (n2,£);£0* (112 ■ ffljr ,£); T implies (di'opg(n2), e); e O* (d ropg (112 ■ ^fr ) , £) ; T by Lemma |A.6 
Induction with ni then yields: 

- (fb,£> ok 

t * — _ 

- (n3-n,£),cr2|gc,K3-[p,/J,p,e ^ (n3,£),a2|gc,K3-Lp,/J,£,i^ 

- (ff^,e);eO* (n3-n,£);r 
Since 112 ok we get 1X2 • (T) ok. 



We get (dropg(ny,_);eO* (dropg(n2 • (T)), _); T2 O (dropg(n2), _) ; (T) •r2 with = (T) -Ta by Lemma|A.8 
and (d). 

Induction with 712 then yields: 

-(n^,T{')ok 

- (n3-(T),£),o^|gc,K3,p',e' (n;5,e),cr^|gc,K3,£,I^ 

- {n'3,e};eO* (n3-(T),£);T^ O (n3,e) 

I n t * 

Finally, using Lemma A.3 (113, e), a2|gc, K3, Pi, "ri — > (Ila-n, e), (T2|gc, K3- Lp, /J , P, ^ 

t * — _ 
(n3 •□,£>, (721 gc, K3 • [p, /J , p, e — ^ (n3, e) , (T2|gc, K3- Lp, /J , £, 

(n3,e),CTj|gc,K3-Lp,/J,£,;^ (n3-(r),e),CTj|gc,K3,p",e', where p" = p[x ^ v] 



It remains to show that u = v and thus p' = p" 
t "1 



■ Recall that we have: 

- (n2-ffl5r,Ti),cr2,K2,£,prop 

- (n^,e);eO* (n2-%,£);r 

- n2-ffl5^ ok 

- 5\ fsc 

■ By Lemmas A. 6 and A. 21 we get two subcases: 
(a) First subcase (of two) 

— In this subcase, we have that: 

( mi 

(112 -ffljr, Ti) , (72, K2, e, prop — > (n", Up,j-r'), cr^, K2, £, prop 

(n", Up,e -T') , K2 , £, prop ^ (n" • Up,e, T') , K2 , P, e 
(n"-U^.g,r'),cr^,K2,p,e > (n2,e),cr2,K2,£,l^ 

(dropg(n^),_);_0* (dropH(n".Up,e),-);-0* (dropH(n2 -fflj^), _) 

n" ok 
Up,g-r' fsc 

last(Ti) last(r') 

Til ~ nil + 1 + 7712 

By Lemma [Al7] we get (H" • Up,j, T') ok. 

From induction on reduction TO2 using part (2), we get: 

' — ^ t * _ 

■ (e,e),cr^|gc,K2,P,e — s- (114, e), (T2|gc, ^2, £, 

• (n4,£);£0* (£,£);- ^ 

we have that (£, e), cr2lgci P, e 
we have that: a 



From Lemma 
From Lemma 
Next, since U 



A.15 



A.9 



(n4,£),a5Lc,£,£,i^ 



A.l 



£,p,e 



0-2^0, £,£, 



A. 17 



there exists Us, 0-5, K5, ps, 65, Ilg, and w' such that 



and Lemma 
p^e • T' fsc, with the help of Lemma 

CSA(CT5,p5,e5) 
noreuse((n5, £)) 

t * 

(115 , £) , , K5 , p5 , 65 — ^ (lis , £) , 0-5 , K5 , p, update e using E.O only 

T' fsc from (Hs-Up^e, £), cts, K5 , p, e producing w' 

t * 

(n5-U^,g,£),cr5,K5,p,e — > {n'r„e),a'^,K5,e,uj' 

(n'5,£);£OMn5-Up,e,£);T' _ _ 

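- From $T'$ fsc, $T_1$ fsc and $\mathrm{last}(T') = \mathrm{last}(T_1)$ we have $\bar\omega' = \mathrm{last}(T') = \mathrm{last}(T_1) = \bar\omega$ by Lemma A.18.

- By Lemma A.16 we get $\mathrm{CSA}(\sigma_5, \rho, \mathsf{update}\ e)$ and thus $\mathrm{SA}(\sigma_5, \rho, \mathsf{update}\ e)$.

- From Lemmas A.4, A.5, A.9, A.15 and A.1 we have $\sigma_5, \varepsilon, \rho, e \to^{*} \sigma_5', \varepsilon, \varepsilon, \bar\omega$.

- Since also $\sigma_2|_{gc}, \varepsilon, \rho, e \to^{*} \sigma_2'|_{gc}, \varepsilon, \varepsilon, \bar\nu$, we have that $\bar\nu = \bar\omega$ by definition of SA.

(b) Second (and last) subcase:

- In this subcase, we have that: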



• (n2-fflf^l), 0-2, K2,£, prop > (n",T'),(7^,K2,£,prOp > (n2,£),0^, K2,£, 

• (dropg(n2),_);_0* (dropg(n")-,;)-0* (dropg(n2-ffl5^), _) ; _ 
■ last(Ti) last(r') 

• Reduction $m_2$ contains no use of P.E

- Applying Lemma A.19 to the reduction $m_2$ we have that $\mathrm{last}(T') = \bar\nu$.

- Putting this together, we have that $\mathrm{last}(T_1) = \mathrm{last}(T') = \bar\nu$.

- Finally, by applying Lemma A.18 to "$T_1$ fsc from ... producing $\bar\omega$", we have that $\mathrm{last}(T_1) = \bar\omega$.

- Hence, $\bar\omega = \bar\nu$.
• Case P.7: 

t t 

■Thenr{ = i^T and (112, T^') , (72, K2, £, prop — > {n2-W,£),(T2,K2,£,W — > (IIj, T"), (72, K2, £, i^. 

■ We show that n — 1 = 0: 

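- Assume the contrary. The only reduction rules that apply to $\langle\Pi_2\cdot\bar\nu_1, \varepsilon\rangle, \sigma_2, \kappa_2, \varepsilon, \bar\nu_1$ are E.8 and P.8. We consider only the former case; the latter is analogous.

- Hence $\langle\Pi_2\cdot\bar\nu_1, \varepsilon\rangle, \sigma_2, \kappa_2, \varepsilon, \bar\nu_1 \xrightarrow{\mathsf{t}} \langle\Pi_2''\cdot(T), T'\rangle, \sigma_2, \kappa_2', \rho_f, e_f \xrightarrow{\mathsf{t}}{}^{n-2} \langle\Pi_2', T_1''\rangle, \sigma_2', \kappa_2, \varepsilon, \bar\nu_2$ where $\langle\Pi_2\cdot\bar\nu_1, \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\Pi_2''\cdot\square, T'\rangle; T$.

- By Lemma A.4 we know $\Pi_2''\cdot\square \in \mathrm{Prefixes}(\Pi_2\cdot\bar\nu_1)$, i.e., $\Pi_2''\cdot\square \in \mathrm{Prefixes}(\Pi_2)$.

- Hence $\mathrm{drop}_\square(\Pi_2''), \mathrm{drop}_\square(\Pi_2''\cdot\square) \in \mathrm{Prefixes}(\mathrm{drop}_\square(\Pi_2)) \subseteq \mathrm{Prefixes}(\mathrm{drop}_\square(\Pi_2'))$ using Lemma A.4 (d), and Lemma A.5.

- Using Lemma A.7 we get $\mathrm{drop}_\square(\Pi_2''\cdot(T)) \in \mathrm{Prefixes}(\mathrm{drop}_\square(\Pi_2'))$, contradicting $\mathrm{drop}_\square(\Pi_2''\cdot\square) \in \mathrm{Prefixes}(\mathrm{drop}_\square(\Pi_2'))$.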
Hence = and a'2 ~ <T2 and 112 = ^2'V2- 

By inversion on (d) we get T2 = V2- 
(112 ■V2; e) ok follows from (b). 

By Lemma A. 17 and (c) we get (Hi, e), ai, ki, pi, ctri ^ 
P'i(^) = vi. 



(Hi , e) , CTi , Ki , p'^ , pop a; using only E.O, where 
■ Hence, using Lemma A. 3 (Ha, e), cr2|gc, K3, Pi, (lis, e), cr2|gc, K3, pi, pop x (Ha -I?^, e>, tJalgc ^3, e, 1^2- 



ri-l 



■Finally, (Hg-I^, e); e O* (n3,e);I?^. 

• Case P.E:

■ Thenr{ = Up.e-^i and (02, CT2, K2, e, prop (02 • Up,e, Ti), 0-2, ^2, p, e (H^, r{') , cr^, K2, e, 

■Hence (dropg(n'2), e O* (dropg(n2-Up,e),-);^ O (dropg(n2),-) ; Up,e-T^ withT^ = Up,e-T^ by Lemma|A!8 
and (d). 

■ By Lemma [A.17| and (a) we get: 

t * 

— (Hi, e), (Ti, Ki, pi, Ctrl — > (Hi, e), tJi, Ki, p, update e using E.O only 

— Ti fsc from (Hi • Up^e , e) , ci , ki , p, e and thus Ti ok 

■ n2- Up.e ok follows from 112 ok. 

■ Induction and part (2) then yield: 

(i) (n^,T{') ok 

t * 

(ii) (n3-Up,e,e),cr2|gc,'«3,P, e — > (If!;, e), a^lgc, K3, e, 1^2 

(iii) (n;5,e);eO* (n3-U^,£);T^ O (n3,e) 

I t * 

■ Finally, using Lemma |A.3[ (113, e), cr2|gc, K3, Pi, — > (Ila-Up^e, e), o-2|gc, K3, P, e. 

• Cases E.0-8, E.P, P.8, U.1-4: not possible

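2. Case analysis on $n$. First, we handle the simple case when $n = 0$:

• Since $n = 0$, we have that:

■ $\langle\Pi_2, T_1'\rangle = \langle\Pi_2', T_1''\rangle$

■ $\sigma_2 = \sigma_2'$

■ $\rho_2 = \varepsilon$

■ $\alpha_{r2} = \bar\nu_2$

• $T_2' = \varepsilon$, by inversion on (c)

• $\langle\Pi_2', T_1''\rangle$ ok is given.

• Pick $\Pi_3' = \Pi_3$.

• Then reflexively we have that $\langle\Pi_3, \varepsilon\rangle, \sigma_2|_{gc}, \kappa_3, \rho_2, \alpha_{r2} \xrightarrow{\mathsf{t}}{}^{*} \langle\Pi_3', \varepsilon\rangle, \sigma_2'|_{gc}, \kappa_3, \varepsilon, \bar\nu_2$.

• Similarly, reflexively we have that $\langle\Pi_3', \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\Pi_3, \varepsilon\rangle; T_2'$.

For $n > 0$ we inspect the first reduction step of (a):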

• Case E.0.

t t n—l 

■Then(n2,r{),a2,«:2,P2,e" (02, T{), ^2, a«2, p'2, < (H^, Tf) , a^, ^2, £, 

■ Induction yields: 

(i) (n^,T{') ok 

(ii) (n3,e),Cr2|gc,K3,p^,Q;r2 — > (n;5,e),CT^|gc,K3,e,J^2 
(iii) (n^,e);eO* (n3,e);T2 

and (ii) we have that (n3,e),(T2|gc,K3,P2,e'' (113, e), cr2|gc, K3, P2, ar2 ~^ 



■ Finally, using Lemma 

(n3,e>,Cr^|gc,K3,£,l^ 

• Case E.l: 



A.3 
^ J. n— i 

■ Then (112, r{), (T2, ^2, P2, let x = allocCy) in e — > (n2-Af,,„,r{),cr2,K2,P2; e — > (II'j, Tf), ct^, k2,s,T^ 
Where: 

- cr2,p2,alloc(?/) ^ai,£ 

- P2{y) = m 

- P2 = P2[x e] 

• From 112 ok we have that 112 • ok 

■By Lemma A.8 and (c) we have (dropg(n^), _); e O* (dropg(n2 • A^,™), _) ; O (dropg(n2), _) ; with 
T2 — Af „i •T2 

■ By induction then: 

(i) (n^,T{') ok 

(ii) (n3-A£,™,£),aJ|gc,K3,P2:e (II!,, e), cr^lgc K3, £, 1?^ 

(iii) (n'3,e);eO* (03 • A,,™, e); O (03, e) ; 

■ Hence, (113, e), cr2|gc, K3,P2,let x = allocCy) in e (n3-A£,„,£),CTj|gc, ^3,^2,6 (n!j,e),o-^|gc,K3,£,'^ 

• Case E.2: 

^ ( n— 1 

■ Then (02, r{), (T2, ^2, P2, let x = read(y[z]) in e — > (n2-R£[„],r{),cr2,K2,P2>e — > (H^, T^')> cr2) «^2, e, 
Where: 

- CT2,p2,read(?/[z]) ^o-2,i^ 

- P2{y)=i 

- p2{z) = m 

- 92= P2[x ^ v] 

• From 112 ok we have that 112 • ^£[m] ok 

■By Lemma jX^s] and (c) we have (dropg(ny , _); e O* (dropB(n2-R^[„]), _);f^ O (dropg(n2), _) ; with 

-'2 — ^l[7n]'-'-2 

■ By induction then: 

(i) (n^,Tn ok 

(ii) (n3-R^[„],e),cr2|gc,K3,P2'e — > (Hg, e), cTalgc, ^3, e, 1^2 

(iii) (n'3,e);eO* (03 • R.^] , e); 5^ O (n3, e) ; 

■ Hence, (n3 , e) , 0-2 1 gc , Ka , /02 , let X = read (y[z]) in e {Ha- R^[„] , e) , (72 1 gc , K3 , P2 - e (HJ, , e) , cr^ | gc , K3 , e , 

• Case E.3: 

t t 

■ Then (n2 , T{ ) , (72 , K2 , P2 , let _ = write (x [y] ,z ) in e ^ (02 • W^[„] , r{ ) , ^J, K2 , P2 , e ^ (H^ , T{') , (7^ , K2 , e, 
Where: 

- (72, P2, write (x[y],z) ^oj, 

- P2{x)^e 

- P2{y) = "T, 

- P2{z) = 1/ 

» 1 „ .U„. TT 



From 112 ok we have that 112 • W^^f^i ok 



■ By Lemma |AJ and (c) we have (dropg(n^), _); e O* (dropg(n2-W^^[,„]), r2 O (dropg(n2), _) ; with 
T2 = W^[„j] •T2 

■ By induction then: 

(i) (n'2,Ti") ok 

(ii) (n3-W^[„],e),CT5|gc,K3,p2,e — > (H!,, e), cr^lgc, K3, e, 17^ 

(iii) {IV'^,e)-eQ* (n3-W,^[„,j,e);?^0 (n3,£);r^ 

■ Hence, (03, e) , (72|gc, K3, P2, let _ = write (x[y],z) in e (n3-Wj'[,„],e),CTj|gc,K3,/92,e (n;5,e),(7^|gc, K3,e,I^ 
• Case E.4: 

J. ^ n— 1 

■ Then (02, r{), (72, K2, P2, memo e — > (n2 • Mp,,e, (72, K2, P2, e — > (H^, T{'), ct^, K2, £, 

■ From 112 ok we have that 112 ■ Mp^ e ok 

■By Lemma and (c) we have (dropg(n^), _); e O* (dropB(n2-Mp2,e), -);5^ O (dropg(n2), _) ; with 
n = Mp„e-r2 
■ By induction then: 

(i) (n^,T{') ok 

t * 

(ii) (n3-Mp,,e,£),cr2|gc,K3,p2,e > {H'^, s) , a'^lgc, ^3, £ , 1^2 

(iii) {Il'3,e};eO* (n3-Mp,,e,e);T^ O (n3,e) 

■ Hence, (Ha, e), cr2|gc, ^3,^2, memo e (n3-Mp2^e,£),CT2|gc, t3,P2,e {n'^,e),a'2\gc,K3,£,'^ 
• Case E.5: 



(n^,T{'),(T^,K2,£,J^2 



■ Then (Ha, r{),f72,K2,P2, update e ^ (n2-Up„e,r{) , cr2,K2,P2,e ! 

■ From 112 ok we have that 112 ■ ^P2,e ok 

■By Lemma and (c) we have (dropg(n^), e O* (dropg(n2- Up^^e), -) ; J2 O (dropg(n2), _) ; with 

^2 — Up2,e-T2 

■ By induction then: 

(i) (n'2,T{') ok 

(ii) (n3-Up2,e,£),Cr2|gc,K3,p2,e ^ (11^ , e) , (73 |gc, ^3 , £, 1^2 

(iii) (n!„e);eO* (03 • Up,,e, e); O (Hg, e) ; 

■ Hence, (Ha, e), CT2|gc, K3 , P2 , update e (n3-Up2,e,e),cr2|gc, «:3,p2,e (njj, e), cr^lgc K3,e,i^ 
• Case E.6: 

^ ^ n— 1 

■ Then {U2,T{),a2, K2,P2,push / do e — > (112 •□, r{), 0-2, 1^2- 

• Note that a lso (n 3 , e) , (72 1 gc , ^ 3 , P2 , push / do e (Hg • □ , e) , ctz I gc , « 3 • [p2 , / J , P2 , e. 

■ By Lemma A. 12 the n ~ 1 reduction above decomposes as follows: 

- {Il2-0,T{),a2,K2-[P2j\,P2,e !■ {n2,T{),a2,K2-[p2,f\,£,'^ 

- {Il2,T{);e(3* {n2-a,T{);T _ 

- (n2, T{) , ctJ, K2 • [p2, /J , £, (n2 ■{T),T{), ctJ, K2 , P2, 6/ 

' — ' t 

- (n2-(T),r{),CTj,K2,p2,e/ — > (n^,T{'),cr^,K2,e,i^ 

- n = ni + 1 + ^2 

■ From (n2, r{) ok we get (n2 •□, T^) ok. 



From (n2,T{);e(3* {Il2-n,T{);T follows (dropg (n2 ) , _) ; e O* (d ropg (n2 •□),_); T by LemmajA^ 
Hence induction with ri i yields: 

- {^2,T{) ok 

t * 

- (n3-n,£),cr2|gc,K3-LP2,/J,P2,e > (Hg , e) , ctJ |gc, K3 ' LP2 , /J , £, 

- (n'3',£);£0* (n3-n,e);T 

Note that (n'3', e) , o^lgc, ^3 ■ LP2, /J , £, (n3 • (T), e), o^|gc, K3, /5^2, e/. 



(Hz • (T), r|) ok follows from (02, r{) ok by Lemma A.2 



■ By Lemma|A.8|and (c) we have (dropg(n^), _); e O* (dropg(n2-(r)), _); O (dropg(n2), _) ; with = 
{T)-T2 

• So induction with n2 yields: 

- (n^,T{') ok 

- (n3-(T),e),0^|gc,K3,P2,e/ (nJj,£),CT2|gc,K3,£,I?^ 

- (n^ , e) ; e O* (n3 • (T) , £) ; O (n3 , e) ; T^^ 

■ Finally, (Eg, e), cr2|gc, '«3, P2,push / do e — (Eg, e), CT2lgc, '«3, J^by putting the pieces together. 
• Case E.7 

_ (. _ _ ^ n— 1 

■ Then (E2, r{), CT2, K2, P2, pop x — > (E2 -F, T{), f72, ^2, e, — ^ (E'2, r{'), CTj, K2,£, 
Where: ^2(3^^)!,^' = V 

• From E2 o k we have that E2 -V ok 

■ By Lemmal^sland (c) we have (dropg(E^), _); £ O* (dropg(E2 -17), _); 5^ O (dropg(E2), _) ; with = V-f^ 

• By induction then: 

(i) (E^,TO ok 

_ _ t * 

(ii) (n3-i',e),cr2|gc,K3,e,i' — > (n!;, e), a-2lgc, ^3, e, 1^2 
(iii) {Il's,s);sO* {Il3-i7,e);T2 {Tl3,e);n 

■ Hence, (Hg, e), crjlgc, ^3, p2, pop ^ (Hg -17, e) , dalgc, ^3, (Hg, e), (Tslgc, K3, 

• Case E.8: We show that this case does not arise. 

■ Then 

^ _ (. - — ^ ^ n— 1 

- (n2 , T{ ) , (72 , K2 • [p, /J , e , — ^ (n2 • (Ts ) , T{ ) , (72 , K2 , - — > (H^ , T{') , (7^ , K2 ' [p, /J , £, 

- (dropg(n2),T{);£0* (dropg(n^.n),T^);r3 

■ By Lemmas |A.7| and p\.4| we get both 

- dropg(n2-n) e Prefixes(dropg(n2)) 

- dropg(n^-(r3)) e Prefixes(dropH(n'2)). 

■ This is a contradiction and thus rules out this case. 

• Case P.8: We show that this case does not arise. 

■ Then 

t ^ ri— 1 

- (n2,T{),(72,K2,p2,ar2 > (112 ' (T4) , r3) , (72 , K2 , prOp > (II'j , Tf) , (7^ , K2 , £, 1^ 

- (dropg(n2),e);eO* (dropg(n;,.fflT3),£);T4 

■ By Lemma s ^.7| and p\.4| we get both 

- dropg(n2-fflT3) e Prefixes(dropg(n2)) 

- drapg(fi^-(r4)) e Prefixes(dropg(n'2)). 

■ This is a contradiction and thus rules out this case. 

• Case E.P 

^ . — ^ ^ n — l 

■ThenT{ = Mp^^^-Ti and (02, (72, K2, P2, memo e ^ (02 • Mp.^e, Ti) , o-2,K2,p2,e — > {n'^,T{'),a'2,K 2,e, i^2 
■Hence (dropg(n!j), _); e (3* (dropg(n2)-Mp,,e, -);T^ O (dropg(n2), _) ; with = Mp^e -5^ by Lemma 
and (c). 

■ From (b) we have that Mp^^e •T'l ok, and by inversion we have that Mp^.e -Ti fsc. 

■ Hence, by Lemma A. 17 we know there exists some components Hi, (7i, ki, pi, a^i such that 

- (Hi, e), (7i, Ki, pi, ari — — > (Hi, e), (7i, Ki, /92, memo e using E.O only 

- Ti fsc from (fli • Mp^.e, e), d, ^i, p2, e 

■ 1X2- Mp2,e ok follows from 112 ok. 

■ Induction and part (1) then yield: 

(i) (n^,T{') ok 

(ii) (n3-Mp2,e,e),(72|gc,'«3,p2,e — > (n;j, e), cr^lgc, K3, e, 1^2 

(iii) {U'3,e);eO* {II^-M, ,,,e);f2 O {Il3,e) 



A.8 



t 

Finally, using Lemma |A3| (113, e), (72|gc, K3, Pi, an — > (Ha- Mp^.e, s), o-2|gc, K3, p2, e. 

• Case U.l 

( - — - ( n — 1 

■ Then (112, A^^^ -T^), (72, 't2,P2,ar2 > (112, T{), (72[^ o], '«2,P2,ar2 — > (n2,T{'),(7^,K2,e,l^ 

■ By inversion on (b) we have that ki m-T[ fsc 

■ By Lemma A. 17 we have that T{ fsc 

■ Hence, t{ ok 

■ Induction yields: 

(i) (n^,T{') ok 

(ii) (n3,e),Cr2[Oh^€]|gc,K3,p2,ar2 — > (n3,e),(72|gc,K3,e,i^ 

(iii) {n'^.e)-eO* (n3,e);T^ 

■ By definition, (72 ^ o]|gc ~ <^2\gc 

t * 

■ Hence, (Ilg, e), (72|gc, K3, P2, ar2 — > (Ilg, e) , (72|gc, K3, e, 1^2 

• Case U.2 

( ^ ri — 1 

■ Then (112, i-T{), (72, K2, P2, ar2 — > {^2-,T[) ,02, k.2, 92,^^2 — > ^(Ila, , cr^, ^2, £, 

■ By inversion on (b), with the knowledge that t 7^ (_), we have that t-T[ fsc 

■ By Lemma A. 17 we have that T[ fsc 
■ Hence, $T_1'$ ok

■ The claim then follows by induction.

• Case U.3

^ ( Tl— 1 

■Then (112, (T{) -T^), (72, K2, P2, ar2^ > (112^5^, r{), CT2, K2, P2, ar2 — > (n2,T{'),(T2,K2,e,i^ 

■ By inversion of (b) we show that $T_1'$ ok and $T_2$ ok:

- Subcase: $T_1'$ ok and $T_2$ ok.

• Immediate.

- Subcase: $(T_1')\cdot T_2$ fsc.

• From Lemma A.17 we have $T_1'$ fsc and $T_2$ fsc.

• The claim then follows immediately.

■ Hence from (b), we have 112 •B;p ok 

■ Note that dropg(n2-B=7) = dropg(n2) 

-'2 

■ Hence, from (c) we have (dropg(n2), _); e O* (dropR(n2 •Bs;7), _); To 

^ 2 

■ The claim then follows by induction. 
• Case U.4 

( - — ^ ^ ^ n—l 

■ Then (n2-B^,e),tT2, t2^P2,ar2 — > (112, T), CT2, K2, P2, ar2 — > {n'2,T{'),a'2,K2,e,T^ 
• From (b) we have both 112 ok and T ok 

■ Note that dropg(ri2- By) = dropg(ri2) 

■ Hence, from (c) we have (dropg(n2), _); e O* (dropg(n2), -); ^2 

■ The claim then follows by induction. 

□ 

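Definition A.9 (Big-step Sugar).

$$\frac{\langle\varepsilon, T\rangle, \sigma, \varepsilon, \rho, \alpha_r \xrightarrow{\mathsf{t}}{}^{*} \langle\Pi, \varepsilon\rangle, \sigma', \varepsilon, \varepsilon, \bar\nu \qquad \langle\Pi, \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\varepsilon, \varepsilon\rangle; T'}{T, \sigma, \rho, \alpha_r \Downarrow T', \sigma', \bar\nu}\ \text{BIGEVAL}$$

$$\frac{\langle\varepsilon, T\rangle, \sigma, \varepsilon, \varepsilon, \mathsf{prop} \xrightarrow{\mathsf{t}}{}^{*} \langle\Pi, \varepsilon\rangle, \sigma', \varepsilon, \varepsilon, \bar\nu \qquad \langle\Pi, \varepsilon\rangle; \varepsilon \circlearrowleft^{*} \langle\varepsilon, \varepsilon\rangle; T'}{T, \sigma \curvearrowright T', \sigma', \bar\nu}\ \text{BIGPROP}$$

Corollary (Big-step Consistency). Suppose $\varepsilon, \sigma_1, \rho_1, \alpha_{r1} \Downarrow T_1, \sigma_1', \bar\nu_1$ and $\mathrm{CSA}(\sigma_1, \rho_1, \alpha_{r1})$.

1. If $T_1, \sigma_2, \rho_2, \alpha_{r2} \Downarrow T_2, \sigma_2', \bar\nu_2$ then $\varepsilon, \sigma_2|_{gc}, \rho_2, \alpha_{r2} \Downarrow T_2', \sigma_2'|_{gc}, \bar\nu_2$

2. If $T_1, \sigma_2 \curvearrowright T_2, \sigma_2', \bar\nu_2$ then $\varepsilon, \sigma_2|_{gc}, \rho_1, \alpha_{r1} \Downarrow T_2', \sigma_2'|_{gc}, \bar\nu_2$

Proof. Immediate corollary of Theorem A.22. □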
B. Proofs for DPS Conversion

In this section, let $D_f$ denote the auxiliary function that is used in the DPS translation of a push command (where $n = \mathrm{Arity}(f)$):

$D_f$ = (fun f'(y). update
          let y1 = read(y[1]) in ···
          let yn = read(y[n]) in
          f (y1, ..., yn, x))

Furthermore, we write FF(X) to denote the set of function names free in the syntactic object X.
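The auxiliary function reads the $n$ results that a callee left in a destination block and forwards them, together with the caller's own destination, to the continuation $f$. A rough sketch of that shape in Python, assuming blocks are modelled as lists (0-indexed here, unlike the 1-indexed reads above) and ignoring the `update` annotation, which only matters for change propagation; `make_aux` and `add_into` are hypothetical names.

```python
def make_aux(f, n, x):
    """Build a model of the auxiliary function f'(y) for a pushed function f.

    f: continuation taking n values followed by a destination block
    n: arity of f (number of values read back from the block y)
    x: destination block of the enclosing computation
    """
    def f_prime(y):
        args = [y[i] for i in range(n)]   # let y_i = read(y[i]) for each i
        return f(*args, x)                # tail call f(y_1, ..., y_n, x)
    return f_prime


def add_into(a, b, dest):
    """Hypothetical continuation: writes the sum of its inputs to dest[0]."""
    dest[0] = a + b


dest = [None]
aux = make_aux(add_into, 2, dest)
aux([3, 4])  # the callee is assumed to have written 3 and 4 into its block
```

After the call, `dest` holds the continuation's result, mirroring how the translated push frame communicates through memory rather than through popped values.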
B.1 DPS Conversion Preserves Extensional Semantics
Definition B.1.

Theorem B.1. If

r * I _ 

• <Jl,Ki,pi,e (Ti,£,£,I/ 

• dom(cr2) = 

dom(ai) 

• K\ r^^^^ K2 

• [Pll C P2 
» P2{X)=£ 

then ai ttl cr2, K2, P2, Ie]x — > cr'i W (T2, £, £, £' where I' = head(f @ £) and I'^i^' , i) = Vifor all i. 
Proof. By induction on the length of the reduction chain. 

• Case e = let fun f(z).e1 in e2:



■ Then ui , ki , pi , e -^^ cti , ki , pi , 62 , £, e, v, where p{ = pi[f ^ fun fiz).ei]. 

■ We know cti i+) (T2, K2, P2, Mx cri i+) ^2, ^2,^2' Nix. where P2 = P2[/ 1-^ ftm /(2@t/).[eily]. 

■ It is easy to see that |pi] C p'^ follows from \pi\ C p2. 
• The claim then follows by induction. 

• Case e = if x then e1 else e2:



■ Suppose pi {x) = (the other case is analogous). 
■Then(7i,Ki,pi,e-H>cri,Ki,pi,ei a[,£,e,V. 
• [Pi] C p2 implies p2(a;) = 0. 

■ Hence we know ai ttl 0-2, K2, P2, [ejx — ^ cri ttl 0-2, K2, P2, [ei] 

■ The claim then follows by induction. 

• Case e = f(z):



■ Then cti, Ki,pi,e ai,Ki,p[,ef (Ti,e,£,i^, where pi = pi[yi H> Pil^i)] •t^f''^"''' andpi(/) = fun/(y).e/. 

■ From [pi] C p2 we know p2(/) = fun /(y @a;).[e/lx. 

■ Hence ai W a2, K2, P2, [e]x tri ttl (72, k;2, P2, {efjx, where P2 = p2[yi p2(-2i)]l=f''^''-'. 

■ It is easy to see that |pi] C p2 follows from |pi] C p2. 

■ The claim then follows by induction. 
• Case e = let y = t in e':



' Then ai, ki, pi,e — ^ a'{ , ki , pi , e' — ^ ui , £, £, i/, where pi = pi [y 1— >■ i^'] and tJi , pi , t — ^ fii' , v'. 

' Since dom(a-2) = {I, £} and ^ ^ dom(cri) D dom(cri') and |pi] C p2, we get ai ttl <T2, P2, (- ci' ttl a2, f'. 

' Hence we know (Ji W cr2, ^2, P2, [eja; cr" tt) (72, K2, P2, [e'lx. where p'2 = p2[y v']. 

' It is easy to see that |pi] C p'^ follows from |pi| C p2. 
■ The claim then follows by induction.

• Case e = push / do e' : 

■ Then cti , ki , pi , e — ^ (Ji , k'^ ,pi,e' — ^ a'i,e,e^V, where n'-^ = k\ ■ [pi , /J . 

■ We know cr2 , ^2 , , [e] 0-2 > «2 > P'2 > [e'| x' , where 

- = C72[(f ,i) ^ A-Ti=v so dom(a^) = } 

- f ^ dom(cr2) Udom((Ti) 

- K2 = K2-LP/,/'J 

-p'2=p'/[a;'^f] 

■ We show k'i r..^®^^^®^ 4: 

- K\ K2 is given. 

- [pi] C p'^ follows from |pi] C p2. 

- p'f{x) = I follows from P2(a;) = i. 

- p'fif) = is obvious. 

■ Also, |pi| C P2 follows from |pi] C p2 
■Finally, p^(x') =£'. 

■ The claim then follows by induction (note that head(?@^) = head(?@^@/)) 

• Case e = pop z and $\kappa_1 = \varepsilon$:

■ Then cti , ki , pi , e — ^ (Ji,e,s,T' and (t[ = ai and t'j = pi (2:j) for all i.

■ From Ki ^^"^^ K2 we get K2 = £ and Z = e.

■ Thus we know cti l±) 0-2, K2, P2, |e]a; — ^ cti 1+) CTj) £)-^> where CTj — a-2[{i, i) i-> P2(-Zi)]'^^f

■ Note that cr^ii, i) — P2(^i) ~ Pi{zi) = vi, for any i.

■ Finally, note that I = head(£) = head{£@l).

• Case e = pop z and $\kappa_1 = \kappa_1' \cdot \lfloor \rho_f, f \rfloor$:

■ Then ai,Ki,pi,e ^ ai,K{,p{, e/ ai, e, e, i^, where pi = pf[yi H> pi(zi)]^^^f''^'''' and p/(/) = fun f(y).ef. 
• From Ki /t2 we know 

- X = x' @ a;y- and 1 = 1 @£f 

- [Pfl C p'^. A p^(a;j) = £j 

- p'fin = 

■ Therefore CTi l+l CT2, ^2, P2, 14^ cti 1+) ct^, kj, £, i, where CT2 = CT2[(^, i) ^ P2iz^)]^i2f^^''^ ■ 

■ And 0-1 ttl £72 , K2 , £, (71 ttl , K2 , P2, / (yi , . . . , , a;/ ) , where P2 = p'f [v ^ ^] ^ P2 (^^i)] j=f''^^'' • 

■ AndcTi l±)(72:K27/'2,/ ^ ^Tl W (^2 i 4 ) P2 ) [e/lxj, . 

■ Note that |pi] C p'2 follows from {pfj C p'^ and [pi] C p2. 

■ The claim then follows by induction. 

□ 

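Corollary. If $\sigma_1, \varepsilon, \rho, e \xrightarrow{\mathsf{r}}{}^{*} \sigma_1', \varepsilon, \varepsilon, \bar\nu$ then $\sigma_1, \varepsilon, [\![\rho]\!], \mathsf{let}\ x = \mathsf{alloc}(n)\ \mathsf{in}\ [\![e]\!]_x \xrightarrow{\mathsf{r}}{}^{*} \sigma_1' \uplus \sigma_2', \varepsilon, \varepsilon, \ell$ with $\sigma_2'(\ell, i) = \nu_i$ for all $i$.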
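The corollary says that running the DPS-converted program with a freshly allocated destination leaves, in that destination, exactly the values the original program would have returned. A minimal sketch of this correspondence in Python, assuming destinations are modelled as mutable lists; `direct`, `dps`, and `run_dps` are hypothetical names for illustration and are not the translation defined in the paper.

```python
def direct(a, b):
    """Direct style: results are returned (popped) to the caller."""
    return (a + b, a * b)


def dps(a, b, dest):
    """Destination-passing style: results are written into dest[0..n-1]."""
    dest[0] = a + b
    dest[1] = a * b


def run_dps(program, n, *args):
    """Model of `let x = alloc(n) in [e]_x`: allocate, run, hand back the block."""
    dest = [None] * n          # fresh destination block of size n
    program(*args, dest)       # the converted program fills it in
    return dest
```

Under these assumptions `tuple(run_dps(dps, 2, 3, 4))` agrees with `direct(3, 4)`, which is the extensional agreement the corollary states for the real translation.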

B.2 DPS Conversion Produces CSA Programs 
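Definition B.2.

$\rho' \propto \rho \iff [\![\rho']\!] \subseteq \rho \;\wedge\; \forall f \in \mathrm{dom}(\rho).\ \mathrm{FF}(\rho(f)) \subseteq \mathrm{dom}(\rho')$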

Definition B.3. 



£^ 

p'fjxf) =£fA p'fif) = Py A 3pi. pi cx p) 
Lemma B.2. If

1. (T, K, p, \e\x cr', k' , p' , update e' 

2. 3pi. pi (X p A FF(|e]a;) C dom(/9i) 

4. p{x) = e 

then: 

• P'{y) = i' 

• 3p2. /02 a /o' A FF(|e"]y) C dom(p2) 

Proof. By induction on the length of the reduction chain in (1). 



• Case e = let fun f(z).e1 in e2:

■ Then a, k, p, {ej^ a, k, p, [eaja; - a', k', p', update e', where p = p[f ^ fun J{z @ y).[ei]j,]

■ Note that (2) has been preserved.

■ The claim then follows by induction.

• Case e = if x then e1 else e2:



■ Suppose p{x) = (the other case is analogous). 

■ Then a, k, p, {ej^ a, k, p, {eij^ ct', «;', p', update e'. 

• Note that (2) has been preserved. 

■ The claim then follows by induction. 

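• Case $e = f(z)$:

■ From (2) we know $\rho(f) = \texttt{fun } f(y \mathbin{@} x).\llbracket e_f \rrbracket_x$.

■ Hence $\sigma, \kappa, \rho, \llbracket e \rrbracket_x \to \sigma, \kappa, \hat\rho, \llbracket e_f \rrbracket_x \to^* \sigma', \kappa', \rho', \texttt{update } e'$, where $\hat\rho = \rho[y_i \mapsto \rho(z_i)]$ for each $i$.

■ Note that (2) has been preserved.

■ The claim then follows by induction.

• Case $e = \texttt{let } z = \iota \texttt{ in } \hat{e}$:

■ Then $\sigma, \kappa, \rho, \llbracket e \rrbracket_x \to \hat\sigma, \kappa, \hat\rho, \llbracket \hat{e} \rrbracket_x \to^* \sigma', \kappa', \rho', \texttt{update } e'$, where $\hat\rho = \rho[z \mapsto \nu]$.

■ Note that (2) has been preserved.

■ The claim then follows by induction.

• Case $e = \texttt{memo } \hat{e}$:

■ Then $\sigma, \kappa, \rho, \llbracket e \rrbracket_x \to \sigma, \kappa, \rho, \llbracket \hat{e} \rrbracket_x \to^* \sigma', \kappa', \rho', \texttt{update } e'$.

■ Note that (2) has been preserved.

■ The claim then follows by induction.

• Case $e = \texttt{update } \hat{e}$:

■ If the length of the reduction is 0, then $e' = \llbracket \hat{e} \rrbracket_x$ and we are done.

■ Otherwise we know $\sigma, \kappa, \rho, \llbracket e \rrbracket_x \to \sigma, \kappa, \rho, \llbracket \hat{e} \rrbracket_x \to^* \sigma', \kappa', \rho', \texttt{update } e'$.

■ Note that (2) has been preserved.

■ The claim then follows by induction.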
Case 



e = push / do e : 

r * 

• Then cr, K, p, fej^ — > cr, K, p'^, push /' do memo let z = alloc (n) in [ejx, where p^ = p[/' ^ D^] 

• Now a, K, py., push /' do memo let z = alloc(n) in pj^ — > a, k, p, [e]^, where: 

- k = K-lp'fJ'\ 

-p = p'^[z^e'] 

• Note that |pi] C p A FF([e]2) C dom(pi) A V/ e dom(p). FF(p(/)) C dom(pi) follows from (2). 

■ Furthermore we know a, R, p, [ej^ — > o-',k', p', update e'. 

■ The claim thus follows by induction if we can show k'^. 

• And yes, we can! 
Case

Case 



e = pop z and k = e: impossible due to (1) 

e = pop z and k = k- [pj, /'J : 



■ From k'^ we know: 

1. b2] ^p'f^yge dom(p}). ffip'fig)) C dom(p2) 

2. = 
3. 

• Hence we know (/) ~ fun f{y @ x /). |e f\x; ■ 

• So a,K,p,{e\x ^ a,k,p,f(yi,...,yn,Xf)^ ct, k, p, [e/l^;^, where: 

- p = p'^[y^i][y^^p{z,)]^;:f(^^ 

• Note that [pa] C p^ A Vg € dom(p'^). FF(p'^(5f)) C dom(p2) impUes [pal C p A FF(|e/l^^) C dom(p2) A V/ e 
dom(p). FF(p(/)) C dom(p2). 

■ Furthermore we know a, k, p, [e/Ja;^, — ^ a', k', p', update e' and the claim thus follows by induction. 



Definition B.4. 



K £' p'fjxf) = if A p'fif) = Py A 3pi. pi (X p'^ 

K-[p'f,r\ >n' 

Lemma B.3. If 

1. K t>^ £' 

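2. $\rho(x) = \ell$

3. $\exists \rho_1.\ \rho_1 \propto \rho \wedge \mathrm{FF}(\llbracket e \rrbracket_x) \subseteq \mathrm{dom}(\rho_1)$

4. $\sigma, \kappa, \rho, \llbracket e \rrbracket_x \to^* \sigma', \varepsilon, \varepsilon, \nu$

then $\nu = \ell'$.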

Proof. By induction on the length of the reduction chain. 
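• Case $e = \texttt{let fun } f(z).e_1 \texttt{ in } e_2$:

■ Then $\sigma, \kappa, \rho, \llbracket e \rrbracket_x \to \sigma, \kappa, \rho', \llbracket e_2 \rrbracket_x \to^* \sigma', \varepsilon, \varepsilon, \nu$, where $\rho' = \rho[f \mapsto \texttt{fun } f(z \mathbin{@} y).\llbracket e_1 \rrbracket_y]$.

■ Note that (3) has been preserved.

■ The claim then follows by induction.

• Case $e = \texttt{if } x \texttt{ then } e_1 \texttt{ else } e_2$: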



' Suppose p{x) — (the other case is analogous). 
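■ Then $\sigma, \kappa, \rho, \llbracket e \rrbracket_x \to \sigma, \kappa, \rho, \llbracket e_1 \rrbracket_x \to^* \sigma', \varepsilon, \varepsilon, \nu$.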
• The claim then follows by induction. 



• Case 



e = / (z) 



From (3) we know p(/) = fun f(y@x').lefjx'- 

-J> a', £, £, V, where p' = p[y^ i-)> p{zi)]f2f'''^''^ ■ 



• Hence a, k, p, (ej^ a, k, p', |e/] 

■ Note that (3) has been preserved. 

■ The claim then follows by induction. 
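• Case $e = \texttt{let } y = \iota \texttt{ in } e'$:

■ Then $\sigma, \kappa, \rho, \llbracket e \rrbracket_x \to \sigma'', \kappa, \rho', \llbracket e' \rrbracket_x \to^* \sigma', \varepsilon, \varepsilon, \nu$, where $\rho' = \rho[y \mapsto \nu']$.

■ Note that (3) has been preserved.

■ The claim then follows by induction.

• Case $e = \texttt{push } f \texttt{ do } e'$: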

■ Then ct, k, p, (ej^ -H> a", k', p', {6%^ a', s, e, V, where 
- «' = «.[p'^,fj 



□ 
- p'f = pir ^ Dj]

- p' = p'f[xf^ ^f] 

We show k' 

- C p'j.. A V.g G dom(p'^). FF(p^(5)) C dom(pi) follows from (3). 

— p'f if) — is obvious. 

— p'fix) — I follows from pix) = ^. 

— II! is given. 

Note that (3) has been preserved. 
The claim then follows by induction. 



Case 



pop z 



and K = e: 



Case 



Then a, k, p, lejx — > c', e, £, £ and thus V — 
From K I' we know 1 = ^ . 

-~ pop z and k = kI ■ [p'^ , /' J : 



■ From K [> ^' we know 

1. 3pi. C p)h^g e dom(p'^). FF(p^(.9)) C dom(pi) 

2. = 

3. p){xf)^if 

4. k' f 

■ So cr,K,p, |el^ cr",K,e,^ ^* a",K',p'J (yi, ...,?/„, x/), where p' = 1-^- £][y, i-> ■^^^f'''"'''. 

■ From (2) and (1) we know p'{f ) fun /(y @a;/).|e/]j;j.. 

■ Thus (T",K',p',f (yi, . . .,yn,Xf) a" , k' , p' , {efjxf ^ a',e,e,V. 
• The claim then follows by induction. 

□ 

Theorem B.4. CSA(cr, let x = alloc (n) in |e]:r) 

Proof. Suppose a, e, let x = alloc(n) in lejx c', n, p', e'. We must show SA(cr', p', e'). We distinguish two cases: 

• Case TO = 0: 

r * 

■ So suppose cr', £, [p],let x = alloc (n) in fej^r — > a", k', p", update e". 

■ Hence t7'[(£,z) ^ Mti^e, {pjix ^ £], |e], ^* a", k', p", update e". 

■ Since e<^, Lemma B.2 yields:

-p"(y) = f 

- Ip2] C p" a FFdel,,) C dom(p2) A V/ G dom(p"). FF(p"(/)) C dom(p2) 

■ Now suppose _, £, p", lejy — ^ _, e,e,I7. 

■ Since e t>^ i!', Lemma B.3 yields V ^ £'. 

• Case TO > 0: 

■ Then a[{e, i) ^ ±]ti,£, M ^ l^h P', e'- 

■ So suppose Q-', e, p', e' ct", k', p", update e". 

■ By Lemma A.10 cr', k, p', e' — cr", p", update e".



Hence a[{e,i) ^ -L]-Li,e, Ipl[a; ^ cr", p", update e". 

The rest goes as in the first case. 



□ 
C. Cost Semantics Proofs

Lemma C.1. If (n, e) ; e O* (H'-D, T2) ; Ti and B^Ii, then $T_2 = \varepsilon$.
Theorem C.l. If 

• CT, K, p, e — _, e, e, Vi, described by sY 

( * _ 

• (n, e) , cr, K, p, e — > £, e, V2, described by S2 

• ffl,B ^ n 

• 7s si = 7s S2 

• 7o- si = 7cr S2 

• 7k Si = 7k 

Proof. By induction on the length of si. Note that si = () is not possible, so si = s*^ :: S3. We analyze s" and in each case 
observe that S2 must start with the step s* that is associated with the corresponding E rule. Note that: 

• Rules E.P and P.E never apply because the reuse trace is empty. 

• Rules P.1-7 never apply because prop is not an expression. 

• Rule P.8 never applies because its premise would imply ffl e H. 

• Rules U.1-3 never apply because the reuse trace is empty. 

• Rule U.4 never applies because B ^ 11. 

In each case, we find that $s^*$ has the same cost as $s''$ in the three models, i.e., $\gamma\, s'' = \gamma\, s^*$ for $\gamma \in \{\gamma_s, \gamma_\sigma, \gamma_\kappa\}$. Also, each step preserves the assumptions. In particular, in rule E.8, $T_2 = \varepsilon$ implies = $\varepsilon$ by Lemma C.1.



□ 

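Corollary. If

• $\sigma, \varepsilon, \rho, e \to^* \_, \varepsilon, \varepsilon, \nu_1$, described by $s_1$

• $\langle \varepsilon, \varepsilon \rangle, \sigma, \varepsilon, \rho, e \to^* \_, \_, \varepsilon, \varepsilon, \nu_2$, described by $s_2$

then:

• $\gamma_s\, s_1\, 0_s = \gamma_s\, s_2\, 0_s$

• $\gamma_\sigma\, s_1\, 0_\sigma = \gamma_\sigma\, s_2\, 0_\sigma$

• $\gamma_\kappa\, s_1\, 0_\kappa = \gamma_\kappa\, s_2\, 0_\kappa$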

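Theorem C.3. Suppose the following:

• $\sigma_1, \varepsilon, \rho, e \to^* \sigma_1', \varepsilon, \varepsilon, \nu$, described by $s_1$

• $\sigma_1, \varepsilon, \llbracket \rho \rrbracket, \texttt{let } x = \texttt{alloc}(n) \texttt{ in } \llbracket e \rrbracket_x \to^* \sigma_1' \uplus \sigma_2, \varepsilon, \varepsilon, \ell$, described by $s_{\mathrm{alloc}} :: s_2$

• $(u, d, \_, \_) = \gamma_\kappa\, s_1\, 0_\kappa$

• $(a_1, r_1, w_1) = \gamma_\sigma\, s_1\, 0_\sigma$

• $(a_2, r_2, w_2) = \gamma_\sigma\, s_2\, 0_\sigma$

• $N$ is the maximum arity of any pop taken in $s_1$

Then:

1. $\gamma_\kappa\, s_1\, 0_\kappa = \gamma_\kappa\, s_2\, 0_\kappa$

2. $a_2 - a_1 = u$

3. $r_2 - r_1 \le N * d$

4. $w_2 - w_1 \le N * (d + 1)$

5. $\gamma_s\, s_2\, 0_s - \gamma_s\, s_1\, 0_s \le (2N + 5) * u + N$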

Proof. Informally, this is easy to see from the definition of DPS conversion, as explained below. The only interesting cases are pushes and pops. Note that since both computations start and end with an empty stack, we know u = d.

1. Observe that the conversion preserves the number and order of pushes and pops and that both computations end in an empty stack.
2. Observe that the translation of a push introduces a single additional alloc.

3. Observe that the translation of a push introduces a function containing at most N additional reads. Each time the stack is 
popped, such a function is executed. This happens d times. 

4. Observe that the translation of a pop introduces at most N additional writes. We know that d + 1 pops are executed (the 
last one when the stack is already empty, thereby terminating the program). 

5. Observe that: executing the translation of a push takes 3 additional steps (function definition, memo, alloc) to reach its body; executing the translation of a pop (of which d + 1 are executed) takes at most N steps before actually doing the pop; in the d cases where the stack is popped, the function generated by the corresponding push is executed, which takes at most 1 + N + 1 steps. In total, since u = d, this adds up to 3 * u + N * (d + 1) + (1 + N + 1) * d = (2N + 5) * u + N additional steps.

Formally, this can be proven by a very tedious induction, similar to (but much more space-consuming than) the proof of Theorem lO. □
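The step count in item 5 can be checked numerically. The following Python sketch sums the three contributions named in the informal proof (3 steps per push, at most N steps before each of the d + 1 pops, and, reading the per-pop callback cost as 1 + N + 1, that many steps on each of the d real pops) and compares the total with the closed form (2N + 5) * u + N under the assumption u = d. The function and parameter names are illustrative only and are not part of the paper's formal development.

```python
def extra_steps(u: int, d: int, n_max: int) -> int:
    """Sum the additional steps term by term, following item 5 of the proof.

    u     -- number of pushes in the original run
    d     -- number of pops that actually pop a frame
    n_max -- N, the maximum arity of any pop
    """
    push_overhead = 3 * u                      # function definition, memo, alloc
    pop_overhead = n_max * (d + 1)             # at most N steps before each of d + 1 pops
    callback_overhead = (1 + n_max + 1) * d    # generated function runs on each real pop
    return push_overhead + pop_overhead + callback_overhead


def closed_form(u: int, n_max: int) -> int:
    """Closed-form bound (2N + 5) * u + N stated in the theorem."""
    return (2 * n_max + 5) * u + n_max


# Both runs start and end with an empty stack, so u = d.
for u in range(20):
    for n_max in range(10):
        assert extra_steps(u, u, n_max) == closed_form(u, n_max)
```

The loop only exercises small values; it illustrates the algebraic identity rather than replacing the induction mentioned above.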